We consider a multi-organizational system in which each organization contributes processors to the
global pool but also jobs to be processed on the common resources. The fairness of the scheduling
algorithm is essential for the stability and even for the existence of such systems (as organizations may
refuse to join an unfair system).

We consider on-line, non-clairvoyant scheduling of sequential jobs. The started jobs cannot be 
stopped, canceled, preempted, or moved to other processors. We consider identical processors, but 
most of our results can be extended to related or unrelated processors. 
We model the fair scheduling problem as a cooperative game and we use the Shapley value to
determine the ideal fair schedule. In contrast to the current literature, we do not use money to assess the
relative utilities of jobs. Instead, to calculate the contribution of an organization, we determine how the
presence of this organization influences the performance of other organizations. Our approach can be
used with an arbitrary utility function (e.g., flow time, tardiness, resource utilization), but we argue that
the utility function should be strategy resilient. The organizations should be discouraged from splitting,
merging or delaying their jobs. We present the unique (up to multiplicative and additive constants)
strategy resilient utility function.

We show that the problem of fair scheduling is NP-hard and hard to approximate. However, for unit- 
size jobs, we present a fully polynomial-time randomized approximation scheme (FPRAS). Also, we 
show that the problem parametrized with the number of organizations is fixed parameter tractable (FPT). 
Although the problem is computationally hard for a large number of organizations, the presented
exponential algorithm can be used as a fairness benchmark. We experimentally assess two heuristics and
show that they produce fairer schedules than the round-robin algorithm.
1 Introduction

In multi-organizational systems, participating organizations give access to their local resources; in return
their loads can be processed on other resources. Examples of such systems include PlanetLab [?], grids
(Grid5000 [?], EGEE [3]) and organizationally distributed storage systems [?]. There are a few incentives for
federating into consortia: the possibility of decreasing the costs of management and maintenance (one 
large system can be managed more efficiently than several smaller ones), but also the willingness to utilize 
resources more efficiently. Peak loads can be offloaded to remote resources. Moreover, organizations can 
access specialized resources or the whole platform (which permits e.g. testing on a large scale). 
In multi-organizational and multi-user systems, the fairness of the resource allocation mechanisms is
as important as their efficiency. The efficiency of BitTorrent depends on users' collaboration, which in turn
requires the available download bandwidth to be distributed fairly [33]. Fairness has also been discussed in
storage systems [4,15,16,18,40,41,45] and computer networks [42]. In scheduling, for instance, a significant
part of the description of Maui [20], perhaps the most common cluster scheduler, focuses on the fair-share
mechanism. Nevertheless, there is no universal agreement on the meaning of fairness; next, we review
the approaches most commonly used in the literature: distributive fairness and game theory.

In distributive fairness, organizations are ensured a fraction of the resources according to predefined
(given) shares. The share of an organization may depend on the perceived importance of the workload or on
payments [4,15,16,40], or may be calculated to satisfy (predefined) service level agreements [18,22,45]. The literature
on distributive fairness describes algorithms distributing resources according to the given shares, but does
not describe how the shares should be set. In scheduling, distributive fairness is implemented through fair
queuing mechanisms: YFQ [1], SFQ and FSFQ [11,21], or their modifications [4,15,16,18,40,41,45,46].

A different approach is to optimize directly the performance (the utility) of users, rather than just the
allocated resources. [24] proposes an axiomatic characterization of fairness based on multi-objective
optimization; [35] applies this concept to scheduling in a multi-organizational system. Inoie et al. [19] propose
a similar approach for load balancing: a fair solution must be Pareto-optimal and the revenues of the players
must be proportional to the revenues in Nash equilibrium.

While distributive fairness might be justified in case of centrally-managed systems (e.g. Amazon EC2 
or a single HPC center), in our opinion it is inappropriate for consortia (e.g., PlanetLab or non-commercial 
scientific systems like Grid5000 or EGEE) in which there is no single "owner" and the participating orga- 
nizations may take actions (e.g. rescheduling jobs on their resources, adding local resources, or isolating 
into subsystems). In case of such systems the shares of the participating organizations should depend both 
on their workload and on the owned resources; intuitively an organization that contributes many "useful" 
machines should be favored; similarly an organization that has only a few jobs. 

Game theory is an established method for describing outcomes of decisions made by agents. If agents
may form binding agreements, cooperative game theory studies the stability of resulting agreements
(coalitions and revenues). There are well-studied concepts of stability [31], like the core, the kernel, the nucleolus,
the stable set or the bargaining set. The Shapley value [36] characterizes what is a fair distribution of the
total revenue of the coalition between the participating agents.

The Shapley value has been used in scheduling theory, but all the models we are aware of use the concept
of money. The works of Carroll et al. [?], Mishra et al. [28], Mashayekhy and Grosu [?] and Moulin et
al. [29] describe algorithms and the process of forming the coalitions for scheduling. These works assume
that each job has a certain monetary value for the issuing organization and each organization has its initial
monetary budget.

Money may have negative consequences on the stakeholders of resource-sharing consortia. Using (or
even mentioning) money discourages people from cooperating [39]. This stays in sharp contrast with the
idea behind the academic systems: sharing the infrastructure is a step towards closer cooperation.
Additionally, we believe that using money is inconvenient in non-academic systems as well. In many contexts, it
is not clear how to valuate the completion of the job or the usage of a resource (especially when the workload
changes dynamically). We think that the accurate valuation is as important (and perhaps as difficult)
as the initial problem of fair scheduling. Although auctions or commodity markets [23] have been
proposed to set prices, these approaches implicitly require setting a reference value to determine
profitability. Other works on monetary game-theoretical models for scheduling include [9,10,13,14,32]; the monetary
approach is also used for other resource allocation problems, e.g. network bandwidth allocation [44].
However, none of these works describes how to valuate jobs and resources.

In a non-monetary approach proposed by Dutot et al. [6] the jobs are scheduled to minimize the global
performance metric (the makespan) with an additional requirement: the utility of each player cannot be
worse than if the player acted alone. Such an approach ensures the stability of the system against actions of
any single user (it is not profitable for the user to leave the system and to act alone) but not against the formation
of sub-coalitions.

In the selfish job model [38] the agents are the jobs that selfishly choose processors on which to execute. 
Similarly to our model the resources are shared and treated as common good; however, no agent contributes 
resources. 

An alternative to scheduling is to allow jobs to share resources concurrently. In congestion games
[?,30,34] the utility of the player using a resource R depends on the number of the players concurrently using
R; the players are acting selfishly. Congestion games for divisible load scheduling were analyzed by [12]
and [37].

In this paper we propose fair scheduling algorithms for systems composed of multiple organizations 
(in contrast to the case of multiple organizations using a system owned by a single entity). We model 
the organizations, their machines and their jobs as a cooperative game. In this game we do not use the 
concept of money. When measuring the contribution of the organization O we analyze how the presence 
of O in the grand coalition influences the completion times of the jobs of all participating organizations.
This contribution is expressed in the same units as the utility of the organization. In the design of the fair
algorithm we use the concept of the Shapley value. In contrast to simple cooperative games, in our case the value
of the coalition (the total utility of the organizations in this coalition) depends on the underlying scheduling
algorithm. This makes the problem of calculating the contributions of the organizations more involved.
First we develop algorithms for arbitrary utilities (e.g. resource utilization, tardiness, flow time, etc.). Next
we argue that designing the scheduling mechanism itself is not enough; we show that the utility function
must be chosen to discourage organizations from manipulating their workloads (e.g. merging or splitting
the jobs; similar ideas have been proposed for the money-based models [29]). We present an exponential
scheduling algorithm for the strategy resilient utility function. We show that the fair scheduling problem is
NP-hard and difficult to approximate. For a simpler case, when all the jobs are unit-size, we present a fully
polynomial-time randomized approximation scheme (FPRAS). According to our experiments this algorithm
is close to the optimum when used as a heuristic for workloads with different sizes of the jobs.

Our contribution is the following: (i) We derive the definition of the fair algorithm from the cooperative
game theory axioms (Definitions 3.1 and 3.2, Algorithm 1 and Theorem 3.3). The algorithm uses only the
notions regarding the performance of the system (no money-based mechanisms). (ii) We present the axioms
(Section 4) and the definition of the fair utility function (Theorem 4.1); this function is similar to the
flow time metric, but the differences make it strategy-resilient (Proposition 4.2). (iii) We show that the fair
scheduling problem is NP-complete (Theorem 5.1) and hard to approximate (Theorem 5.3). However, the
problem parametrized with the number of organizations is fixed parameter tractable (FPT). (iv) We present
an FPRAS for a special case with unit-size jobs (Algorithm 3, Theorems 5.6 and 5.7). The experiments
show that this algorithm used as a heuristic for the general case is close to the optimum.



2 Preliminaries 

Organizations, machines, jobs. We consider a system built by a set of independent organizations
$\mathcal{O} = \{O^{(1)}, O^{(2)}, \ldots, O^{(k)}\}$. Each organization $O^{(u)}$ owns a computational cluster consisting of
$m^{(u)}$ machines (processors) denoted as $M_1^{(u)}, M_2^{(u)}, \ldots, M_{m^{(u)}}^{(u)}$ and produces its jobs, denoted as
$J_1^{(u)}, J_2^{(u)}, \ldots$. Each job $J_i^{(u)}$ has release time $r_i^{(u)} \in \mathbb{T}$, where $\mathbb{T}$ is a discrete set of time moments.
We consider an on-line problem in which each job is unknown until its release time. We consider a
non-clairvoyant model, i.e., the job's processing time is unknown until the job completes (hence we do not
need to use imprecise [25] run-time estimates). For the sake of simplicity of the presentation we assume that
machines are identical, i.e., each job $J_i^{(u)}$ can be executed on any machine and its processing always takes
$p_i^{(u)}$ time units; $p_i^{(u)}$ is the processing time. Most of the results, however, can be extended to the case of
related machines, where $p_i^{(u)}$ is a function of the schedule; the only exception is the results for unit-size
jobs in Section 5, where we rely on the assumption that each job processed on any machine takes exactly one
time unit. The results even generalize to the case of unrelated machines; however, if we assume a
non-clairvoyant model with unrelated machines (i.e., we do not know the processing times of the jobs on any
machine), then we cannot optimize the assignment of jobs to machines.

The jobs are sequential (this is a standard assumption in many scheduling models and, particularly, in
the selfish job model [38]; an alternative is to consider parallel jobs, which we plan to do in the future).
Once a job is started, the scheduler cannot preempt it or migrate it to another machine (this assumption is usual
in HPC scheduling because of high migration costs). Finally, we assume that the jobs of each individual
organization should be started in the order in which they are presented. This allows organizations to have an
internal prioritization of their jobs.

Cooperation, schedules. Organizations can cooperate and share their infrastructure; in such a case we
say that organizations form a coalition. Formally, a coalition $\mathcal{C}$ is a subset of the set of all organizations,
$\mathcal{C} \subseteq \mathcal{O}$. We also consider a specific coalition consisting of all organizations, which we call a grand coalition
and denote as $\mathcal{C}_g$ (formally, $\mathcal{C}_g = \mathcal{O}$, but in some contexts we use the notation $\mathcal{C}_g$ to emphasize that we
are referring to the set of the organizations that cooperate). The coalition must agree on the schedule
$\sigma = \bigcup_{(u)} \bigcup_i \{(J_i^{(u)}, s_i^{(u)}, M(J_i^{(u)}))\}$, which is a set of triples; a triple $(J_i^{(u)}, s_i^{(u)}, M(J_i^{(u)}))$ denotes a job $J_i^{(u)}$
started at time moment $s_i^{(u)} \geq r_i^{(u)}$ on machine $M(J_i^{(u)})$. We assume that a machine executes at most
one job at any time moment. We often identify a job $J_i^{(u)}$ with a pair $(s_i^{(u)}, p_i^{(u)})$, and a schedule with
$\bigcup_{(u)} \bigcup_i \{(s_i^{(u)}, p_i^{(u)})\}$ (we do so for a more compact presentation of our results). The coalition uses all the
machines of its participants and schedules consecutive tasks on available machines. We consider only greedy
schedules: at any time moment, if there is a free processor and a non-empty set of ready but not yet scheduled
jobs, some job must be assigned to the free processor. Since we know neither the characteristics
of the future workload nor the duration of the started but not yet completed jobs, any non-greedy policy
would result in unnecessary delays in processing jobs. Also, such greedy policies are used in real-world
schedulers [20].

Let $\mathfrak{J}$ denote the set of all possible sets of the jobs. An online scheduling algorithm (in short, a scheduling
algorithm) $\mathcal{A} : \mathfrak{J} \times \mathbb{T} \to \mathcal{O}$ is an online algorithm that continuously builds a schedule: for a given time
moment $t \in \mathbb{T}$ such that there is a free machine in $t$, and a set of jobs released before $t$ but not yet scheduled,
$\mathcal{J} \in \mathfrak{J}$, $\mathcal{A}(\mathcal{J}, t)$ returns the organization whose task should be started. The set of all possible schedules
produced by such algorithms is the set of feasible schedules, denoted by $\Gamma$. We recall that in each feasible
schedule the tasks of a single organization are started in FIFO order.
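The greedy, non-preemptive model described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the function names `greedy_schedule` and `round_robin`, the dictionary-based job format and the discrete time loop are all hypothetical choices made for illustration and do not come from the paper. The policy argument plays the role of $\mathcal{A}$: it sees only which organizations have waiting jobs and the current time, never the processing times.

```python
import heapq
from collections import deque
from itertools import count

def greedy_schedule(jobs, machines, choose, horizon):
    """Simulate a greedy, non-preemptive schedule on identical machines.

    jobs: dict org -> list of (release_time, processing_time), sorted by release
    machines: total number of machines contributed by the coalition
    choose: policy (waiting_orgs, t) -> org; it never sees processing times
    Returns a list of (org, job_index, start_time) triples.
    """
    queues = {org: deque() for org in jobs}   # FIFO queue of released jobs per org
    released = {org: 0 for org in jobs}       # number of jobs released so far per org
    running = []                              # min-heap of finish times of started jobs
    schedule = []
    for t in range(horizon):
        for org, org_jobs in jobs.items():
            while released[org] < len(org_jobs) and org_jobs[released[org]][0] <= t:
                queues[org].append(released[org])
                released[org] += 1
        while running and running[0] <= t:    # free machines whose jobs completed by t
            heapq.heappop(running)
        while len(running) < machines:        # greedy: never idle while jobs wait
            waiting = [org for org in jobs if queues[org]]
            if not waiting:
                break
            org = choose(waiting, t)
            index = queues[org].popleft()     # jobs of one organization start in FIFO order
            schedule.append((org, index, t))
            heapq.heappush(running, t + jobs[org][index][1])
    return schedule

def round_robin():
    """Hypothetical policy: serve the organization that was served least recently."""
    last_served = {}
    ticket = count()
    def choose(waiting, t):
        org = min(waiting, key=lambda o: last_served.get(o, -1))
        last_served[org] = next(ticket)
        return org
    return choose

jobs = {"A": [(0, 2), (0, 2), (0, 2)], "B": [(0, 1)]}
one_machine = greedy_schedule(jobs, machines=1, choose=round_robin(), horizon=10)
two_machines = greedy_schedule(jobs, machines=2, choose=round_robin(), horizon=10)
```

Any other policy with the same signature can be plugged in, which makes it possible to compare the schedules that different selection rules produce for the same workload.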

Objectives. We consider a utility function $\psi : \Gamma \times \mathcal{O} \times \mathbb{T} \to \mathbb{R}$ that for a given feasible schedule $\sigma \in \Gamma$, an
organization $O^{(u)}$, and a time moment $t$ gives the value corresponding to the organization's satisfaction
from the schedule $\sigma$ until $t$. Examples of such utility functions that are common in scheduling theory are
flow time, resource utilization, turnaround, etc. Our scheduling algorithms use only the notion of
utilities and do not require any external payments.

Since a schedule $\sigma$ is fully determined by a scheduling algorithm $\mathcal{A}$ and a coalition of organizations $\mathcal{C}$,
we often identify $\psi(\mathcal{A}, \mathcal{C}, O^{(u)}, t)$ with the appropriate $\psi(\sigma, O^{(u)}, t)$. Also, we use a shorter notation $\psi^{(u)}(\mathcal{C})$
instead of $\psi(\mathcal{A}, \mathcal{C}, O^{(u)}, t)$ whenever $\mathcal{A}$ and $t$ are known from the context. We define the characteristic
function $v : \Gamma \times \mathbb{T} \to \mathbb{R}$ describing the total utility of the organizations from a schedule:
$v(\mathcal{A}, \mathcal{C}, t) = \sum_{O^{(u)} \in \mathcal{C}} \psi(\mathcal{A}, \mathcal{C}, O^{(u)}, t)$. Analogously, we can use an equivalent formulation,
$v(\sigma, t) = \sum_{O^{(u)} \in \mathcal{C}} \psi(\sigma, O^{(u)}, t)$, and a shorter notation $v(\mathcal{C})$ whenever possible. Note that the utilities of
the organizations $\psi^{(u)}(\mathcal{C})$ constitute a division of the value of the coalition $v(\mathcal{C})$.
3 Fair scheduling based on the Shapley value



In this section our goal is to find a scheduling algorithm $\mathcal{A}$ that in each time moment $t$ ensures a fair
distribution of the value of the coalition $v(\mathcal{C})$ between the participating organizations. We denote
this desired fair division of the value $v$ as $\phi^{(1)}(v), \phi^{(2)}(v), \ldots, \phi^{(k)}(v)$, meaning that $\phi^{(u)}(v)$ denotes the
ideally fair revenue (utility) obtained by organization $O^{(u)}$. We would like the values $\phi^{(u)}(v)$ to satisfy the
fairness properties first proposed by Shapley [36] (below we give intuitive motivations; see [36] for further
arguments).

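1) efficiency - the total value $v(\mathcal{C})$ is distributed:

\[ \sum_{O^{(u)} \in \mathcal{C}} \phi^{(u)}(v(\mathcal{C})) = v(\mathcal{C}). \]

2) symmetry - the organizations $O^{(u)}$ and $O^{(u')}$ having indistinguishable contributions obtain the same
profits:

\[ \left( \forall_{\mathcal{C}' \subset \mathcal{C} :\, O^{(u)}, O^{(u')} \notin \mathcal{C}'} \; v(\mathcal{C}' \cup \{O^{(u)}\}) = v(\mathcal{C}' \cup \{O^{(u')}\}) \right) \Rightarrow \phi^{(u)}(v(\mathcal{C})) = \phi^{(u')}(v(\mathcal{C})). \]

3) additivity - for any two characteristic functions $v$ and $w$ and a function $(v+w)$ such that
$\forall_{\mathcal{C}' \subseteq \mathcal{C}} \; (v+w)(\mathcal{C}') = v(\mathcal{C}') + w(\mathcal{C}')$, we have that $\forall_{\mathcal{C}' \subseteq \mathcal{C}} \; \forall_u$:

\[ \phi^{(u)}((v+w)(\mathcal{C}')) = \phi^{(u)}(v(\mathcal{C}')) + \phi^{(u)}(w(\mathcal{C}')). \]

Consider any two independent schedules $\sigma_1$ and $\sigma_2$ that together form a schedule $\sigma_3 = \sigma_1 \cup \sigma_2$ ($\sigma_1$ and
$\sigma_2$ are independent iff removing any subset of the jobs from $\sigma_1$ does not influence the completion time of
any job in $\sigma_2$ and vice versa). The profit of an organization that participates only in one schedule (say $\sigma_1$)
must be the same in case of $\sigma_1$ and $\sigma_3$ (intuitively: the jobs that do not influence the current schedule also
do not influence the current profits). The profit of every organization that participates in both schedules
should in $\sigma_3$ be the sum of the profits in $\sigma_1$ and $\sigma_2$. Intuitively: if the schedules are independent then the
profits are independent too.

4) dummy - an organization that does not increase the value of any coalition $\mathcal{C}' \subset \mathcal{C}$ gets nothing:

\[ \left( \forall_{\mathcal{C}' \subset \mathcal{C}} : v(\mathcal{C}' \cup \{O^{(u)}\}) = v(\mathcal{C}') \right) \Rightarrow \phi^{(u)}(v(\mathcal{C})) = 0. \]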

Since the four properties are actually the axioms of the Shapley value [36], they fully determine the
single mapping between the coalition values and the profits of organizations (known as the Shapley value).
In game theory the Shapley value is considered the classic mechanism ensuring the fair division of the
revenue of the coalition⁴. The Shapley value can be computed by the following formula [36]:

\[ \phi^{(u)}(v(\mathcal{C})) = \sum_{\mathcal{C}' \subseteq \mathcal{C} \setminus \{O^{(u)}\}} \frac{\|\mathcal{C}'\|! \, (\|\mathcal{C}\| - \|\mathcal{C}'\| - 1)!}{\|\mathcal{C}\|!} \left( v(\mathcal{C}' \cup \{O^{(u)}\}) - v(\mathcal{C}') \right) \tag{1} \]

Let $\mathcal{L}_{\mathcal{C}}$ denote all orderings of the organizations from the coalition $\mathcal{C}$. Each ordering can be associated
with a permutation of the set $\mathcal{C}$, thus $\|\mathcal{L}_{\mathcal{C}}\| = \|\mathcal{C}\|!$. For the ordering $\prec_{\mathcal{C}} \in \mathcal{L}_{\mathcal{C}}$ we define
$\prec_{\mathcal{C}}(O^{(i)}) = \{O^{(j)} \in \mathcal{C} : O^{(j)} \prec_{\mathcal{C}} O^{(i)}\}$ as the set of all organizations from $\mathcal{C}$ that precede $O^{(i)}$ in the order $\prec_{\mathcal{C}}$. The
Shapley value can be alternatively expressed [31] in the following form:

\[ \phi^{(u)}(v(\mathcal{C})) = \frac{1}{\|\mathcal{C}\|!} \sum_{\prec_{\mathcal{C}} \in \mathcal{L}_{\mathcal{C}}} \left( v(\prec_{\mathcal{C}}(O^{(u)}) \cup \{O^{(u)}\}) - v(\prec_{\mathcal{C}}(O^{(u)})) \right). \tag{2} \]

This formulation has an interesting interpretation. Consider the organizations joining the coalition $\mathcal{C}$ in the
order $\prec_{\mathcal{C}}$. Each organization $O^{(u)}$, when joining, contributes to the current coalition the value equal to
$v(\prec_{\mathcal{C}}(O^{(u)}) \cup \{O^{(u)}\}) - v(\prec_{\mathcal{C}}(O^{(u)}))$. Thus, $\phi^{(u)}(v(\mathcal{C}))$ is the expected contribution to the coalition $\mathcal{C}$,
when the expectation is taken over the order in which the organizations join $\mathcal{C}$. Hereinafter we call the
value $\phi^{(u)}(v(\mathcal{C}))$ (or, using a shorter notation, $\phi^{(u)}$) the contribution of the organization $O^{(u)}$.

⁴ The Shapley value has other interesting axiomatic characterizations [43].
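To make the two formulations concrete, the sketch below computes the Shapley value both as the weighted sum over subcoalitions (Equation 1) and as the average marginal contribution over all join orders (Equation 2), and checks that they agree. The toy characteristic function `v` (the squared number of machines in a coalition) and the machine counts are assumptions chosen only for illustration; in the paper the value of a coalition comes from a schedule, not from a closed formula.

```python
from itertools import combinations, permutations
from math import factorial

def shapley_subsets(players, v):
    """Equation (1): weighted marginal contributions over subcoalitions."""
    n = len(players)
    phi = {}
    for u in players:
        others = [p for p in players if p != u]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for sub in combinations(others, size):
                coalition = frozenset(sub)
                total += weight * (v(coalition | {u}) - v(coalition))
        phi[u] = total
    return phi

def shapley_orderings(players, v):
    """Equation (2): average marginal contribution over all join orders."""
    phi = {u: 0.0 for u in players}
    for order in permutations(players):
        before = frozenset()
        for u in order:
            phi[u] += v(before | {u}) - v(before)
            before = before | {u}
    return {u: total / factorial(len(players)) for u, total in phi.items()}

# Hypothetical toy game: the value of a coalition is its squared machine count.
machines = {"A": 1, "B": 2, "C": 3}

def v(coalition):
    return sum(machines[org] for org in coalition) ** 2

by_subsets = shapley_subsets(list(machines), v)
by_orderings = shapley_orderings(list(machines), v)
```

In this toy game the contributions come out proportional to the number of machines (6, 12 and 18), and they sum to the value of the grand coalition, as the efficiency axiom requires. Both functions take time exponential in the number of players, which is consistent with the exponential cost of the exact algorithm discussed later in this section.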

Let us consider a specific scheduling algorithm $\mathcal{A}$, a specific time moment $t$, and a specific coalition
$\mathcal{C}$. Ideally, the utilities of the organizations should be equal to the reference fair values, $\forall_u \; \psi^{(u)}(\mathcal{C}) =
\phi^{(u)}(v(\mathcal{C}))$ (meaning that the utility of the organization is equal to its contribution), but our scheduling
problem is discrete, so an algorithm guaranteeing this property may not exist. Thus, we call fair an
algorithm that results in utilities close to contributions. The following definition of a fair algorithm is
recursive in two ways. A fair algorithm for a coalition $\mathcal{C}$ and time $t$ must also be fair for all subcoalitions $\mathcal{C}' \subset \mathcal{C}$
and for all previous $t' < t$ (an alternative to being fair for all previous $t' < t$ would be to ensure asymptotic
fairness; however, our formulation is more responsive and more relevant for the online case. We want to
avoid the case in which an organization is disfavored in one, possibly long, time period and then favored in
the next one).

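Definition 3.1 Set an arbitrary metric $\|\cdot\|_d : 2^k \times 2^k \to \mathbb{R}_{\geq 0}$ and set an arbitrary time moment $t \in \mathbb{T}$. $\mathcal{A}$
is a fair algorithm in $t$ for coalition $\mathcal{C}$ in metric $\|\cdot\|_d$ if and only if:

\[ \mathcal{A} \in \operatorname{argmin}_{\mathcal{A}' \in \mathcal{F}(<t)} \left\| \vec{\phi}(\mathcal{A}', \mathcal{C}, t) - \vec{\psi}(v(\mathcal{A}', \mathcal{C}, t)) \right\|_d \]

where:

1. $\mathcal{F}(<t)$ is a set of algorithms fair in each point $t' < t$; $\mathcal{F}(<0)$ is a set of all greedy algorithms,

2. $\vec{\psi}(v(\mathcal{A}', \mathcal{C}))$ is a vector of utilities $\langle \psi^{(u)}(v(\mathcal{A}', \mathcal{C})) \rangle$,

3. $\vec{\phi}(\mathcal{A}', \mathcal{C})$ is a vector of contributions $\langle \phi^{(u)}(v(\mathcal{A}', \mathcal{C})) \rangle$, where $\phi^{(u)}(v(\mathcal{A}', \mathcal{C}))$ is given by Equation 1,

4. In Equation 1, for any $\mathcal{C}' \subset \mathcal{C}$, $v(\mathcal{C}')$ denotes $v(\mathcal{A}_f, \mathcal{C}')$, where $\mathcal{A}_f$ is any fair algorithm for coalition
$\mathcal{C}'$.

Definition 3.2 $\mathcal{A}$ is a fair algorithm for coalition $\mathcal{C}$ if and only if it is fair in each time $t \in \mathbb{T}$.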

Further on, we consider algorithms fair in the Manhattan metric⁵: $\|\vec{v}_1, \vec{v}_2\|_M = \sum_{i=1}^{k} |v_1[i] - v_2[i]|$.
Based on Definition 3.2 we construct a fair algorithm for an arbitrary utility function $\psi$ (Algorithm 1).
The algorithm keeps a schedule for every subcoalition $\mathcal{C}' \subseteq \mathcal{C}$. For each time moment the algorithm
complements the schedule starting from the subcoalitions of the smallest size. The values of all smaller coalitions
$v[\mathcal{C}_s]$ are used to update the contributions of the organizations (the nested loops of the procedure UpdateVals).
Before scheduling any job of the coalition $\mathcal{C}$ the contribution and the utility of each organization in $\mathcal{C}$ are
updated (procedure UpdateVals). If there is a free machine and a set of jobs waiting for execution, the
algorithm selects the job according to Definition 3.1; thus it selects the organization that minimizes the
distance of the utilities $\vec{\psi}$ to their ideal values $\vec{\phi}$ (procedure SelectAndSchedule). Assuming the first job

⁵ Our analysis can be generalized to other distance functions.
Algorithm 1: Fair algorithm for arbitrary utility function ψ.

Notation:
    jobs[C][O^(u)]: list of waiting jobs of organization O^(u).
    φ[C][O^(u)]: the contribution of O^(u) in C, φ^(u)(C).
    ψ[C][O^(u)]: utility of O^(u) from being in C, ψ(C, O^(u)).
    v[C]: value of a coalition C.
    σ[C]: schedule for a coalition C.
    FreeMachine(σ, t): returns true if and only if there is a free machine in σ in time t.

ReleaseJob(O^(u), J):
    for C : O^(u) ∈ C do
        jobs[C][O^(u)].push(J)

Distance(C, O^(u), t):
    old ← σ[C];
    new ← σ[C] ∪ {(jobs[C][O^(u)].first, t)};
    Δψ ← ψ(new, O^(u), t) − ψ(old, O^(u), t);
    return |φ[C][O^(u)] + Δψ/||C|| − ψ[C][O^(u)] − Δψ|
           + Σ_{O^(u') ≠ O^(u)} |φ[C][O^(u')] + Δψ/||C|| − ψ[C][O^(u')]|;

SelectAndSchedule(C, t):
    u ← argmin_{O^(u)} (Distance(C, O^(u), t));
    σ[C] ← σ[C] ∪ {(jobs[C][u].first, t)};
    ψ[C][O^(u)] ← ψ(σ[C], O^(u), t);

UpdateVals(C, t):
    foreach O^(u) ∈ C do
        ψ[C][O^(u)] ← ψ(σ[C], O^(u), t);
        φ[C][O^(u)] ← 0;
    v[C] ← Σ_{O^(u) ∈ C} ψ(σ[C], O^(u), t);
    foreach C_sub : C_sub ⊆ C do
        foreach O^(u) ∈ C_sub do
            φ[C][O^(u)] ← φ[C][O^(u)]
                + (v[C_sub] − v[C_sub \ {O^(u)}]) · (||C_sub|| − 1)! (||C|| − ||C_sub||)! / ||C||!;

FairAlgorithm(C):
    foreach time moment t do
        foreach job J_i^(u) : r_i^(u) = t do
            ReleaseJob(O^(u), J_i^(u));
        for s ← 1 to ||C|| do
            foreach C' ⊆ C, such that ||C'|| = s do
                UpdateVals(C', t);
                while FreeMachine(σ[C'], t) do
                    SelectAndSchedule(C', t);
                v[C'] ← Σ_{O^(u) ∈ C'} ψ(σ[C'], O^(u), t);


Algorithm 2: Function SelectAndSchedule for utility function ψ_sp.

SelectAndSchedule(C, t):
    u ← argmin_{O^(u)} (ψ[C][O^(u)] − φ[C][O^(u)]);
    σ[C] ← σ[C] ∪ {(jobs[C][u].first, t)};
    ψ[C][O^(u)] ← ψ(σ[C], O^(u), t);


Algorithm 3: Fair algorithm for utility function ψ_sp and for unit-size jobs.

Notation:
    ε, λ: as in Theorem 5.6.

Prepare(C):
    N ← [...] (expression in ||C||, ε and λ; illegible in the source);
    Γ ← generate N random orderings (permutations) of the set of all organizations (with replacement);
    Subs ← Subs' ← ∅;
    foreach ≺ ∈ Γ do
        for u ← 1 to ||C|| do
            C' ← {O^(i) : O^(i) ≺ O^(u)};
            Subs ← Subs ∪ {C'};
            Subs' ← Subs' ∪ {C' ∪ {O^(u)}};

ReleaseJob(O^(u), J):
    for C' ∈ Subs ∪ Subs' : O^(u) ∈ C' do
        jobs[C'][O^(u)].push(J)

SelectAndSchedule(C, t):
    u ← argmin_{O^(u)} (ψ[C][O^(u)] − φ[O^(u)]);
    σ[C] ← σ[C] ∪ {(jobs[C][u].first, t)};
    finPerOrg[O^(u)] ← finPerOrg[O^(u)] + 1;
    φ[O^(u)] ← φ[O^(u)] + 1;

FairAlgorithm(C):
    Prepare(C);
    foreach time moment t do
        foreach job J_i^(u) : r_i^(u) = t do
            ReleaseJob(O^(u), J_i^(u));
        foreach C' ∈ Subs ∪ Subs' do
            v[C'] ← v[C'] + finPerCoal[C'];
            n ← min(Σ_{O^(u) ∈ C'} m^(u), ||jobs[C'][O^(u)]||);
            remove first n jobs from jobs[C'][O^(u)];
            finPerCoal[C'] ← finPerCoal[C'] + n;
            v[C'] ← v[C'] + n;
        foreach O^(u) ∈ C do
            ψ[O^(u)] ← ψ[O^(u)] + finPerOrg[O^(u)];
            φ[O^(u)] ← 0;
            foreach C' ∈ Subs : O^(u) ∉ C' do
                marg_φ ← v[C' ∪ {O^(u)}] − v[C'];
                φ[O^(u)] ← φ[O^(u)] + marg_φ · 1/N;
        while FreeMachine(σ[C], t) do
            SelectAndSchedule(C, t);

of the organization $O^{(u)}$ is tentatively scheduled, the procedure Distance computes a distance between
the new values of $\vec{\psi}$ and $\vec{\phi}$. The procedure Distance works as follows. Assuming $O^{(u)}$ is selected, the
value $\Delta\psi$ denotes the increase of the utility of $O^{(u)}$ thanks to scheduling its first waiting job. This is also
the increase of the value of the whole coalition. When the procedure Distance$(\mathcal{C}, O^{(u)}, t)$ is executed, the
schedules (and thus the values) in time $t$ for all subcoalitions $\mathcal{C}' \subset \mathcal{C}$ are known. The schedule for coalition
$\mathcal{C}$ is known only in time $(t - 1)$, as we have not yet decided which job should be scheduled in $t$. Thus,
scheduling the job will change the schedule (and the value) only for the coalition $\mathcal{C}$. From Equation 1 it
follows that if the value $v(\mathcal{C})$ of the coalition $\mathcal{C}$ increases by $\Delta\psi$ and the value of all subcoalitions remains the
same, then the contribution $\phi^{(u')}$ of each organization $O^{(u')} \in \mathcal{C}$ to $\mathcal{C}$ will increase by the same value, equal
to $\Delta\psi / \|\mathcal{C}\|$. Thus, for each organization $O^{(u')} \in \mathcal{C}$ the new contribution of $O^{(u')}$ is $(\phi[\mathcal{C}][O^{(u')}] + \Delta\psi / \|\mathcal{C}\|)$.

The new utility for each organization $O^{(u')} \in \mathcal{C}$ such that $O^{(u')} \neq O^{(u)}$ is equal to $\psi[\mathcal{C}][O^{(u')}]$. The new
utility of the organization $O^{(u)}$ is equal to $(\psi[\mathcal{C}][O^{(u)}] + \Delta\psi)$.
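The tentative-scheduling step described above can be sketched in a few lines. This is an illustrative fragment, not the paper's code: the function names `distance` and `select`, the dictionaries `phi`, `psi` and `gains`, and the numbers in the example are assumptions. For each candidate organization the sketch raises every contribution by $\Delta\psi / \|\mathcal{C}\|$, raises only the candidate's utility by $\Delta\psi$, and measures the Manhattan distance between the two vectors.

```python
def distance(phi, psi, u, delta_psi):
    """Manhattan distance between contributions and utilities after
    tentatively scheduling the first waiting job of organization u.

    phi, psi: dicts org -> current contribution / utility within one coalition
    delta_psi: utility gained by u if its first waiting job starts now
    """
    share = delta_psi / len(phi)              # every contribution grows by this amount
    total = 0.0
    for org in phi:
        new_phi = phi[org] + share
        new_psi = psi[org] + (delta_psi if org == u else 0.0)
        total += abs(new_phi - new_psi)
    return total

def select(phi, psi, gains):
    """Pick the organization whose tentative scheduling minimizes the distance.

    gains: dict org -> delta_psi, listing only organizations with waiting jobs
    """
    return min(gains, key=lambda u: distance(phi, psi, u, gains[u]))

# Hypothetical state: A has contributed more than it has received so far.
phi = {"A": 10.0, "B": 4.0}
psi = {"A": 6.0, "B": 8.0}
gains = {"A": 2.0, "B": 2.0}
chosen = select(phi, psi, gains)
```

In this example scheduling a job of A gives distance 6 and scheduling a job of B gives distance 10, so A, whose utility lags behind its contribution, is selected.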

Theorem 3.3 Algorithm 1 is a fair algorithm.

Proof. Algorithm 1 is a straightforward implementation of Definition 3.2. □

Proposition 3.4 In each time moment $t$ the time complexity of Algorithm 1 is $O\left(\|\mathcal{O}\| \left(2^{\|\mathcal{O}\|} \sum_{u} m^{(u)} + 3^{\|\mathcal{O}\|}\right)\right)$.

Proof. Once the contribution is calculated, each coalition in $t$ may schedule at most $\sum_{u} m^{(u)}$ jobs. The
time needed for selecting each such job is proportional to the number of the organizations. Thus, we
get the $\|\mathcal{O}\| 2^{\|\mathcal{O}\|} \sum_{u} m^{(u)}$ part of the complexity. For calculating the contribution of the organization $O^{(u)}$
to the coalition $\mathcal{C}$ the algorithm considers all subsets of $\mathcal{C}$; there are $2^{\|\mathcal{C}\|}$ such subsets. Since there are
$\binom{\|\mathcal{O}\|}{k}$ coalitions of size $k$, the number of the operations required for calculating the contributions of all
organizations is proportional to:

\[ \sum_{O^{(u)}} \sum_{k=0}^{\|\mathcal{O}\|} \binom{\|\mathcal{O}\|}{k} 2^k = \|\mathcal{O}\| \sum_{k=0}^{\|\mathcal{O}\|} \binom{\|\mathcal{O}\|}{k} 1^{\|\mathcal{O}\| - k} 2^k = \|\mathcal{O}\| (1 + 2)^{\|\mathcal{O}\|} = \|\mathcal{O}\| 3^{\|\mathcal{O}\|}. \]

This gives the $\|\mathcal{O}\| 3^{\|\mathcal{O}\|}$ part of the complexity and completes the proof. □

Corollary 3.5 The problem of finding a fair schedule parametrized with the number of organizations is FPT.
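The counting step in the proof of Proposition 3.4 relies on the binomial identity $\sum_{k} \binom{n}{k} 2^k = 3^n$. As a quick sanity check, the sketch below compares the closed form with an explicit enumeration of all (coalition, subcoalition) pairs for small $n$. The function names are hypothetical and the enumeration is feasible only for a handful of organizations.

```python
from itertools import combinations
from math import comb

def pairs_by_formula(n):
    """Sum over coalition sizes k of (number of coalitions) * (number of subsets)."""
    return sum(comb(n, k) * 2 ** k for k in range(n + 1))

def pairs_by_enumeration(n):
    """Explicitly count pairs (coalition, subcoalition) with subcoalition inside coalition."""
    orgs = range(n)
    pairs = 0
    for k in range(n + 1):
        for coalition in combinations(orgs, k):
            for sub_size in range(k + 1):
                for _ in combinations(coalition, sub_size):
                    pairs += 1
    return pairs

small_checks = [(n, pairs_by_formula(n), pairs_by_enumeration(n)) for n in range(7)]
```

Each triple in `small_checks` holds $n$, the value of the formula and the enumerated count; both equal $3^n$, which is the per-organization factor in the complexity bound.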



4 Strategy-proof utility functions 

There are many utility functions considered in scheduling, e.g. flow time, turnaround time, resource
utilization, makespan, tardiness. However, it is not sufficient to design a fair algorithm for an arbitrary utility
function $\psi$. Some functions may create an incentive for organizations to manipulate their workload: to divide
the tasks into smaller pieces, to merge them or to delay them. This is undesired, as an organization should neither
profit nor suffer from the way it presents its workload. An organization should present its jobs in the most
convenient way; it should not focus on playing against other organizations. We show that in organizationally
distributed systems, as we have to take into account such manipulations, the choice of the utility functions
is restricted.

For the sake of this section we introduce additional notation: let us fix an arbitrary organization $O^{(u)}$
and let $\sigma_t$ denote a schedule of the jobs of $O^{(u)}$ in time $t$. The jobs $J_i(s_i, p_i)$ of $O^{(u)}$ are characterized by
their start times $s_i$ and processing times $p_i$. We consider envy-free utility functions that, for a given
organization $O^{(u)}$, depend only on the schedule of the jobs of $O^{(u)}$. This means that there is no external
economic relation between the organizations (the organization $O^{(u)}$ cares about $O^{(v)}$ only if the jobs of $O^{(v)}$
influence the jobs of $O^{(u)}$, in contrast to looking directly at the utility of $O^{(v)}$). We also assume the
non-clairvoyant model: the utility in time $t$ depends only on the jobs or the parts of the jobs completed before
or at $t$. Let us assume that our goal is to maximize the utility function⁶. We start by presenting the desired
properties of the utility function $\psi$ (when presenting the properties we use the shorter notation $\psi(\sigma_t)$ for
$\psi(\sigma_t, t)$):
1) Task anonymity (starting times): improving the completion time of a single task with a certain
processing time $p$ by one unit of time is for each task equally profitable. For $s, s' < t - 1$, we require:

\[ \psi(\sigma_t \cup \{(s, p)\}) - \psi(\sigma_t \cup \{(s + 1, p)\}) = \psi(\sigma'_t \cup \{(s', p)\}) - \psi(\sigma'_t \cup \{(s' + 1, p)\}) > 0. \]

2) Task anonymity (number of tasks): in each schedule increasing the number of completed tasks is
equally profitable. For $s < t - 1$, we require:

\[ \psi(\sigma_t \cup \{(s, p)\}) - \psi(\sigma_t) = \psi(\sigma'_t \cup \{(s, p)\}) - \psi(\sigma'_t) > 0. \]

3) Strategy-resistance: the organization cannot profit from merging multiple smaller jobs into one larger
job or from dividing a larger job into smaller pieces:

\[ \psi(\sigma_t \cup \{(s, p_1)\}) + \psi(\sigma_t \cup \{(s + p_1, p_2)\}) = \psi(\sigma_t \cup \{(s, p_1 + p_2)\}). \]

Apart from dividing and merging the jobs, each organization can delay the release time of its jobs and
artificially increase the size of the jobs. Delaying the jobs is, however, never profitable for the organization
(by property 1). Also, the strategy-resistance property discourages the organizations from increasing the sizes
of their jobs (the utility coming from processing a larger job is always greater).

To within a multiplicative and additive constants, there is only one utility function satisfying the afore- 
mentioned properties. 

Theorem 4.1 Let $\psi$ be a utility function that satisfies the 3 properties: task anonymity (starting times); task
anonymity (number of tasks); strategy-resistance. Then $\psi$ is of the following form:

\[ \psi(\sigma, t) = \sum_{(s,p) \in \sigma_t} \min(p, t-s)\left(K_1 - K_2\,\frac{s + \min(s+p-1,\, t-1)}{2}\right) + K_3, \]

where

1. $K_1 = \psi(\sigma \cup \{(0,1)\}, t) - \psi(\sigma, t) > 0$

2. $K_2 = \psi(\sigma \cup \{(s,p)\}, t) - \psi(\sigma \cup \{(s+1,p)\}, t) > 0$

3. $K_3 = \psi(\emptyset)$.



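Proof.

\begin{align*}
\psi(\sigma, t) &= \psi\Big(\bigcup_{(s,p) \in \sigma_t} \{(s,p)\},\, t\Big) = \psi\Big(\bigcup_{(s,p) \in \sigma_t} \{(s, \min(p, t-s))\},\, t\Big) && \text{(non-clairvoyance)} \\
&= \sum_{(s,p) \in \sigma_t}\ \sum_{i=s}^{\min(s+p-1,\,t-1)} \psi(\{(i,1)\}, t) && \text{(strategy-resistance)} \\
&= \sum_{(s,p) \in \sigma_t}\ \sum_{i=s}^{\min(s+p-1,\,t-1)} \psi(\{(0,1)\}, t) \;-\; K_2 \sum_{(s,p) \in \sigma_t}\ \sum_{i=s}^{\min(s+p-1,\,t-1)} i && \text{(starting times anonymity)} \\
&= \psi(\emptyset) + \sum_{(s,p) \in \sigma_t}\ \sum_{i=s}^{\min(s+p-1,\,t-1)} K_1 \;-\; K_2 \sum_{(s,p) \in \sigma_t} \min(p, t-s)\,\frac{s + \min(s+p-1,\, t-1)}{2} && \text{(number of tasks anonymity)} \\
&= K_3 + \sum_{(s,p) \in \sigma_t} \min(p, t-s)\left(K_1 - K_2\,\frac{s + \min(s+p-1,\, t-1)}{2}\right) && \text{(sum of the arithmetic progression)}
\end{align*}

□

6 We can easily transform the problem to the minimization form by taking the inverse of the standard maximization utility
function.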



We set the constants $K_1$, $K_2$, $K_3$ so as to simplify the form of the utility function and to ensure that the
utility is always positive. With $K_1 = t$, $K_2 = 1$ and $K_3 = 0$, we get the following strategy-proof utility
function:

\[ \psi_{sp}(\sigma, t) = \sum_{(s,p) \in \sigma \,:\, s < t} \min(p, t-s)\left(t - \frac{s + \min(s+p-1,\, t-1)}{2}\right). \tag{3} \]
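The strategy-proof utility can be evaluated directly from a list of (start time, processing time) pairs. The following is a minimal sketch, not the authors' implementation; the function names `psi_sp` and `psi_sp_unit_parts` are hypothetical. The second function recomputes the same value by splitting every job into unit-size parts, each worth $(t - t_s)$, which is the interpretation used in the text.

```python
def psi_sp(schedule, t):
    """Closed form of the strategy-proof utility for (start, processing_time) pairs."""
    total = 0.0
    for s, p in schedule:
        if s >= t:
            continue  # jobs not started before t contribute nothing
        done = min(p, t - s)            # number of unit parts started before t
        last = min(s + p - 1, t - 1)    # start time of the last such unit part
        total += done * (t - (s + last) / 2)
    return total


def psi_sp_unit_parts(schedule, t):
    """Same utility, summed over unit-size parts: a part starting at ts is worth t - ts."""
    total = 0
    for s, p in schedule:
        for ts in range(s, s + p):
            if ts < t:
                total += t - ts
    return total


if __name__ == "__main__":
    sched = [(0, 3), (0, 4), (10, 4)]
    print(psi_sp(sched, 13), psi_sp_unit_parts(sched, 13))
    print(psi_sp(sched, 14), psi_sp_unit_parts(sched, 14))
```

Both functions agree on every schedule because the closed form is just the sum of an arithmetic progression over the unit parts of each job.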

$\psi_{sp}$ can be interpreted as the task throughput. A task with processing time $p_i$ can be identified with
$p_i$ unit-sized tasks starting in consecutive time moments. Intuitively, the function $\psi_{sp}$ assigns to each such
unit-sized task starting at time $t_s$ a utility value equal to $(t - t_s)$; the higher the utility value, the earlier
this unit-sized task completes. The utility of the schedule is the sum of the utilities over all such unit-sized
tasks. $\psi_{sp}$ is similar to the flow time except for two differences: (i) Flow time is a minimization objective,
but increasing the number of completed jobs increases its value. E.g., scheduling no jobs results in zero
(optimal) flow time, but of course an empty schedule cannot be considered optimal (breaking the second
axiom); (ii) Flow time favors short tasks, which is an incentive for dividing tasks into smaller pieces (this
breaks the strategy-resistance axiom). The differences between the flow time and $\psi_{sp}$ are also illustrated by the
example in Figure 1. The similarity of $\psi_{sp}$ to the flow time is quantified by Proposition 4.2 below.

Proposition 4.2 Let $J$ be a fixed set of jobs, each having the same processing time $p$ and each completed
before $t$. Then, maximization of the $\psi_{sp}$ utility is equivalent to minimization of the flow time of the jobs.

Proof. Let $\sigma$ denote an arbitrary schedule of $J$. Since the flow time uses the release times of the jobs, we
identify the jobs with the triples $(s, p, r)$ where $s$, $p$ and $r$ denote the start time, processing time and
release time, respectively. Let $\psi_{ft}(\sigma)$ denote the total flow time of the jobs from $J$ in schedule $\sigma$. We have:

\begin{align*}
\psi_{sp}(\sigma, t) &= \sum_{(s,p,r) \in \sigma \,:\, s < t} \min(p, t-s)\left(t - \frac{s + \min(s+p-1,\, t-1)}{2}\right) \\
&= \sum_{(s,p,r) \in \sigma} p\left(t - \frac{2s + p - 1}{2}\right) && \text{(each job is completed before } t\text{)} \\
&= \sum_{(s,p,r) \in \sigma} \left(pt + \frac{p^2 + p}{2} - pr\right) - p \sum_{(s,p,r) \in \sigma} \big((p + s) - r\big) \\
&= \|J\|\left(pt + \frac{p^2 + p}{2}\right) - p \sum_{(s,p,r) \in \sigma} r - p\,\psi_{ft}(\sigma).
\end{align*}

Since $p$, $\|J\|\left(pt + \frac{p^2+p}{2}\right)$ and $\sum_{(s,p,r) \in \sigma} r$ are constants, we get the thesis. □

Figure 1: Consider 9 jobs owned by $O^{(1)}$ and a single job owned by $O^{(2)}$, all scheduled on 3 processors. We
assume all jobs were released in time 0. In this example all jobs finish before or at time $t = 14$. The utility
$\psi_{sp}$ of the organization $O^{(1)}$ in time 13 does not take into account the last uncompleted unit of the job $J_9^{(1)}$,
thus it is equal to $3 \cdot (13 - \frac{0+2}{2}) + 4 \cdot (13 - \frac{0+3}{2}) + \ldots + 3 \cdot (13 - \frac{9+11}{2}) + 3 \cdot (13 - \frac{10+12}{2}) = 262$. The utility
in time 14 takes into account all the parts of the jobs, thus it is equal to $3 \cdot (14 - \frac{0+2}{2}) + 4 \cdot (14 - \frac{0+3}{2}) + \ldots + 3 \cdot (14 - \frac{9+11}{2}) + 4 \cdot (14 - \frac{10+13}{2}) = 297$.
The flow time in time 14 is equal to $3 + 4 + \ldots + 14 = 70$. If
there was no job $J_1^{(2)}$, then $J_9^{(1)}$ would be started in time 9 instead of 10 and the utility $\psi_{sp}$ in time 14 would
increase by $4 \cdot (\frac{10+13}{2} - \frac{9+12}{2}) = 4$ (the flow time would decrease by 1). If, for instance, $J_6^{(1)}$ was started
one time unit later, then the utility of the schedule would decrease by 6 (the flow time would increase by
1), which shows that the utility takes into account the sizes of the jobs (in contrast to the flow time). If
the job $J_9^{(1)}$ was not scheduled at all, the utility $\psi_{sp}$ would decrease by 10, which shows that the schedule
with more tasks has higher (more optimal) utility (the flow time would decrease by 14; since flow time is a
minimization metric, this breaks the second axiom regarding the tasks anonymity).
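The identity behind Proposition 4.2 can be checked numerically: for equal-size jobs that all complete before $t$, the sum $\psi_{sp} + p \cdot \psi_{ft}$ does not depend on the start times. The sketch below is only an illustration under that assumption; the helper names `psi_sp`, `flow_time` and `invariant` are hypothetical.

```python
def psi_sp(schedule, t):
    """Strategy-proof utility for (start, processing_time, release_time) triples."""
    total = 0.0
    for s, p, _r in schedule:
        if s >= t:
            continue
        done = min(p, t - s)
        last = min(s + p - 1, t - 1)
        total += done * (t - (s + last) / 2)
    return total


def flow_time(schedule):
    """Total flow time: completion time minus release time, summed over jobs."""
    return sum(s + p - r for s, p, r in schedule)


def invariant(schedule, t):
    """psi_sp + p * flow_time; constant over schedules of the same equal-size job set."""
    p = schedule[0][1]
    return psi_sp(schedule, t) + p * flow_time(schedule)


if __name__ == "__main__":
    early = [(0, 3, 0), (3, 3, 1), (6, 3, 2)]
    late = [(2, 3, 0), (5, 3, 1), (8, 3, 2)]
    print(invariant(early, 20), invariant(late, 20))
```

Since the invariant is fixed, any schedule that lowers the flow time by one unit raises $\psi_{sp}$ by exactly $p$.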



5 Fair scheduling with strategy-proof utility 

For the concrete utility function $\psi_{sp}$ we can simplify the SelectAndSchedule function in Algorithm 1.
The simplified version is presented in Algorithm 2.

The algorithm selects the organization that has the largest difference $(\phi^{(u)} - \psi^{(u)})$, that is, the
organization that has the largest contribution in comparison to the obtained utility. One can wonder whether
we can select the organization in polynomial time, without keeping the $2^{\|C\|}$ schedules for all subcoalitions.
Unfortunately, the problem of calculating the credits for a given organization is NP-hard.

Theorem 5.1 The problem of calculating the contribution $\phi^{(u)}(C, t)$ for a given organization $O^{(u)}$ in
coalition $C$ in time $t$ is NP-hard.

Proof. We present the reduction of the SubsetSum problem (which is NP-hard) to the problem of
calculating the contribution for an organization. Let $I$ be an instance of the SubsetSum problem. In $I$ we are
given a set of $k$ integers $S = \{x_1, x_2, \ldots, x_k\}$ and a value $x$. We ask whether there exists a subset of $S$
with the sum of elements equal to $x$. From $I$ we construct an instance $I_{con}$ of the problem of calculating the
contribution for a given organization. Intuitively, we construct the set of $(\|S\| + 2)$ organizations: $\|S\|$ of
them will correspond to the appropriate elements from $S$. The two dummy organizations $a$ and $b$ are used
for our reduction. One dummy organization $a$ has no jobs. The second dummy organization $b$ has a large
job that dominates the value of the whole schedule. The instance $I_{con}$ is constructed in such a way that for
each coalition $C$ such that $b \in C$ and such that the elements of $S$ corresponding to the organizations from
$C$ sum up to a value lower than $x$, the marginal contribution of $a$ to $C$ is $L + o(L)$, where $o(L)$ is small
in comparison with $L$. The marginal contribution of $a$ to other coalitions is small ($o(L)$). Thus, from the
contribution of $a$, we can count the subsets of $S$ with the sum of the elements lower than $x$. By repeating
this procedure for $(x + 1)$ we can count the subsets of $S$ with the sum of the elements lower than $(x + 1)$.
By comparing the two values, we can determine whether there exists a subset of $S$ with the sum of the elements
equal to $x$. The precise construction is described below.

Let $S_{<x} = \{S' \subseteq S : \sum_{x_i \in S'} x_i < x\}$ be the set of the subsets of $S$, each having the sum of the
elements lower than $x$. Let $n_{<x}(S) = \sum_{S' \in S_{<x}} (\|S'\| + 1)!\,(\|S\| - \|S'\|)!$ be the number of the orderings
(permutations) of the set $S \cup \{a, b\}$ that start with some permutation of the union of exactly one element
of $S_{<x}$ (which is some subset of $S$ such that the sum of the elements of this subset is lower than $x$) and
$\{b\}$, followed by the element $a$. In other words, if we associate the elements from $S \cup \{a, b\}$ with the
organizations and each ordering of the elements of $S \cup \{a, b\}$ with the order of the organizations joining the
grand coalition, then $n_{<x}(S)$ is the number of the orderings corresponding to the cases when organization
$a$ joins the grand coalition just after all the organizations from $S' \cup \{b\}$, where $S'$ is some element of $S_{<x}$. Of
course $S_{<x} \subseteq S_{<(x+1)}$. Note that there exists $S' \subseteq S$ such that $\sum_{x_i \in S'} x_i = x$ if and only if the set $S_{<x}$ is a
proper subset of $S_{<(x+1)}$ (i.e. $S_{<x} \subsetneq S_{<(x+1)}$). Indeed, there exists $S'$ such that $S' \notin S_{<x}$ and $S' \in S_{<(x+1)}$
if and only if $\sum_{x_i \in S'} x_i < x + 1$ and $\sum_{x_i \in S'} x_i \ge x$, from which it follows that $\sum_{x_i \in S'} x_i = x$. Also,
$S_{<x} \subsetneq S_{<(x+1)}$ if and only if $n_{<(x+1)}(S)$ is greater than $n_{<x}(S)$ (we are doing a summation of positive
values over the larger set).
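The counting argument can be illustrated by brute force on a small instance. The sketch below enumerates all subsets, so it is exponential and only meant to make the definition of $n_{<x}(S)$ concrete; the function names `n_less_than` and `has_subset_sum` are hypothetical.

```python
from itertools import combinations
from math import factorial


def n_less_than(S, x):
    """Number of orderings of S ∪ {a, b} in which a directly follows S' ∪ {b}
    for some subset S' of S whose elements sum to less than x."""
    k = len(S)
    total = 0
    for size in range(k + 1):
        for subset in combinations(S, size):
            if sum(subset) < x:
                # (|S'| + 1)! orders of S' ∪ {b} before a, (|S| - |S'|)! orders after a
                total += factorial(size + 1) * factorial(k - size)
    return total


def has_subset_sum(S, x):
    """A subset summing to exactly x exists iff the count grows from x to x + 1."""
    return n_less_than(S, x + 1) > n_less_than(S, x)


if __name__ == "__main__":
    S = [3, 5, 8]
    print(n_less_than(S, 9))
    print(has_subset_sum(S, 11), has_subset_sum(S, 12))
```

In the reduction itself the count is not enumerated but read off from the contribution of the dummy organization $a$.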

In $I_{con}$ there is a set of $(k + 2)$ machines, each owned by a different organization. We will denote the
set of the first $k$ organizations as $O_S$, the $(k+1)$-th organization as $a$ and the $(k+2)$-th organization as $b$. Let
$x_{tot} = \sum_{j=1}^{k} x_j + 2$. The $i$-th organization from $O_S$ has 4 jobs: $J_1^{(i)}$, $J_2^{(i)}$, $J_3^{(i)}$ and $J_4^{(i)}$, with release times
$r_1^{(i)} = r_2^{(i)} = 0$, $r_3^{(i)} = 3$ and $r_4^{(i)} = 4$; and processing times $p_1^{(i)} = p_2^{(i)} = 1$, $p_3^{(i)} = 2x_{tot}$ and $p_4^{(i)} = 2x_i$.
The organization $a$ has no jobs; the organization $b$ has two jobs $J_1^{(b)}$ and $J_2^{(b)}$, with release times $r_1^{(b)} = 2$ and
$r_2^{(b)} = (2x + 3)$; and processing times $p_1^{(b)} = (2x + 2)$ and $p_2^{(b)} = L = 4\|S\| x_{tot}^2 \,((k+2)!) + 1$ (intuitively
$L$ is a large number).

Until time $t = 2$ only the organizations from $O_S$ have some (unit-size) jobs to be executed. The
organization $b$ has no jobs till time $t = 2$, so it will run one or two unit-size jobs of the other organizations,
contributing to all such coalitions that include $b$ and some other organizations from $O_S$. This construction
allows us to enforce that in the first time moment after $t = 2$ when there are jobs of some of the organizations
from $O_S$ and of $b$ available for execution, the job of $b$ will be selected and scheduled first.

Let us consider the contribution of $a$ to a coalition $C$ such that $a \notin C$ and $b \in C$. There are $(\|C \cap O_S\| + 2)$
machines in the coalition $C \cup \{a\}$. The schedule in $C \cup \{a\}$ after $t = 2$ looks as follows (this
schedule is depicted in Figure 2). In time $t = 2$ one machine (let us denote this machine as $M'$) starts the job
$J_1^{(b)}$. In time $t = 3$ some $\|C \cap O_S\|$ machines start the third jobs (the ones with size $2x_{tot}$) of the organizations
from $C \cap O_S$ and one machine (denoted as $M''$) starts the fourth jobs of the organizations from $C \cap O_S$;
the machine $M''$ completes processing all these jobs in time $2y + 4$, where $y = \sum_{i : O^{(i)} \in C \wedge O^{(i)} \in O_S} x_i$ (of
course $2y + 4 < 2x_{tot}$). In time $(2x + 3)$, if $y < x$ the machine $M''$ starts processing the large job $J_2^{(b)}$
of the organization $b$; otherwise machine $M''$ in time $(2x + 3)$ still executes some job (as the jobs $J_4^{(i)}$
processed on $M''$ start in even time moments). In time $2x + 4$, if $y \ge x$, the large job $J_2^{(b)}$ is started by
machine $M'$ just after the job $J_1^{(b)}$ is completed ($J_1^{(b)}$ completes in $(2x + 4)$); here we use the fact that after
$t = 2$, $b$ will be prioritized over the organizations from $O_S$. To sum up: if $y < x$ then the large job $J_2^{(b)}$ is
started in time $(2x + 3)$, otherwise it is started in time $(2x + 4)$.

Figure 2: The schedules for the coalition $C \cup \{a\}$ for two cases: a) $\sum_{i : O^{(i)} \in C \wedge O^{(i)} \in O_S} x_i < x$,
b) $\sum_{i : O^{(i)} \in C \wedge O^{(i)} \in O_S} x_i \ge x$. The two cases a) and b) differ only in the schedules on machines $M'$ and $M''$.
In the case a) the large job $J_2^{(b)}$ (marked as light gray) is started one time unit earlier than in case b).



If $y < x$ then by considering only a decrease of the starting time of the largest job, the contribution of $a$
to the coalition $C$ can be lower bounded by $c_1$:

\[ c_1 = L\left(t - \frac{(2x+3) + (2x+3+L)}{2}\right) - L\left(t - \frac{(2x+4) + (2x+4+L)}{2}\right) = L. \]

The organization $a$ also causes a decrease of the starting times of the small jobs (the jobs of the organizations
from $O_S$), each job of size smaller than or equal to $2x_{tot}$. The starting time of each such small job is decreased
by at most $2x_{tot}$ time units. Thus, the contribution of $a$ in case $y < x$ can be upper bounded by $c_2$:

\[ c_2 \le L + 4\|S\| x_{tot}^2. \]

If $y \ge x$ then $a$ causes only a decrease of the starting times of the small jobs of the organizations from
$O_S$, so the contribution of $a$ to $C$ in this case can be upper bounded by $c_3$:

\[ c_3 \le 4\|S\| x_{tot}^2. \]

By similar reasoning we can see that the contribution of $a$ to any coalition $C$ such that $b \notin C$ is also upper
bounded by $4\|S\| x_{tot}^2$.

The contribution of organization $a$, $\phi^{(a)}$, is given by the equation defining the contribution, with $u = a$ and
$C = \{O^{(1)}, \ldots, O^{(k+2)}\}$. Thus:

\[ \phi^{(a)} = \sum_{C' \subseteq C \setminus \{a\}} \frac{\|C'\|!\,(k + 1 - \|C'\|)!}{(k+2)!}\ \mathrm{marg}_\phi(C', a), \]

where $\mathrm{marg}_\phi(C', a)$ is the contribution of $a$ to coalition $C'$. All the coalitions $C'$ such that $a \notin C'$, $b \in C'$
and $\sum_{i : O^{(i)} \in C' \cap O_S} x_i < x$ will contribute to $\phi^{(a)}$ the value at least equal to $\frac{n_{<x}(S)}{(k+2)!} c_1 = \frac{n_{<x}(S)\,L}{(k+2)!}$ (as there
are exactly $n_{<x}(S)$ orderings corresponding to the case when $a$ is joining such coalitions $C'$) and at most
equal to $\frac{n_{<x}(S)}{(k+2)!} c_2 \le \frac{n_{<x}(S)\,(L + 4\|S\| x_{tot}^2)}{(k+2)!}$. The other $(k+2)! - n_{<x}(S)$ orderings will contribute to $\phi^{(a)}$ the
value at most equal to $\frac{(k+2)! - n_{<x}(S)}{(k+2)!} c_3 = \frac{((k+2)! - n_{<x}(S))\,(4\|S\| x_{tot}^2)}{(k+2)!}$. Also:

\[ \frac{((k+2)! - n_{<x}(S))\,(4\|S\| x_{tot}^2)}{(k+2)!} + \frac{n_{<x}(S)\,(4\|S\| x_{tot}^2)}{(k+2)!} = 4\|S\| x_{tot}^2 < \frac{L}{(k+2)!}, \]

which means that $\phi^{(a)}$ can be stated as $\phi^{(a)} = \frac{n_{<x}(S)\,L}{(k+2)!} + R$, where $0 \le R < \frac{L}{(k+2)!}$. We conclude that
$\left\lfloor \frac{(k+2)!\,\phi^{(a)}}{L} \right\rfloor = n_{<x}(S)$. We have shown that calculating the value of $\phi^{(a)}$ allows us to find the value $n_{<x}(S)$.
Analogously, we can find $n_{<(x+1)}(S)$. By comparing $n_{<x}(S)$ with $n_{<(x+1)}(S)$ we find the answer to the
initial SubsetSum problem, which completes the proof.

□ 

We propose the following definition of the approximation of the fair schedule (similar definitions of the 
approximation ratio are used for multi-criteria optimization problems Q): 

Definition 5.2 Let $\sigma$ be a schedule and let $\psi$ be a vector of the utilities of the organizations in $\sigma$. We say
that $\sigma$ is an $\alpha$-approximation fair schedule in time $t$ if and only if there exists a truly fair schedule $\sigma^*$, with
the vector $\psi^* = \langle \psi^{(u),*} \rangle$ of the utilities of the organizations, such that:

\[ \|\psi - \psi^*\|_M \le \alpha \|\psi^*\|_M = \alpha \sum_u \psi^{(u),*} = \alpha \cdot v(\sigma^*, C). \]

Unfortunately, the problem of finding the fair schedule is difficult to approximate. Unless P = NP, there is no
polynomial algorithm with an approximation ratio better than 1/2 (Theorem 5.3 below). This means that the
problem is practically inapproximable. Consider two
schedules of jobs of $m$ organizations on a single machine. Each organization has one job; all the jobs are
identical. In the first schedule $\sigma_{ord}$ the jobs are scheduled in order: $J_1^{(1)}, J_1^{(2)}, \ldots, J_1^{(m)}$ and in the second
schedule $\sigma_{rev}$ the jobs are scheduled in exactly the reverse order: $J_1^{(m)}, J_1^{(m-1)}, \ldots, J_1^{(1)}$. The relative distance
between $\sigma_{ord}$ and $\sigma_{rev}$ tends to 1 (with increasing $m$), so a $(\frac{1}{2})$-approximation algorithm does not allow us to
decide whether $\sigma_{ord}$ is truly better than $\sigma_{rev}$. In other words, a $(\frac{1}{2})$-approximation algorithm cannot distinguish
whether a given order of the priorities of the organizations is more fair than the reverse order.
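The claim that the relative distance tends to 1 can be checked on a small model. The sketch below assumes unit-size jobs, evaluates the utilities at time $t = m$ (when all jobs are completed) and treats the first order as the reference schedule; these choices and the function names are assumptions made for illustration only.

```python
def utilities(order, t):
    """order[k] is the organization whose unit-size job starts at time k
    on the single machine; a unit job started at s is worth t - s."""
    return {org: t - start for start, org in enumerate(order)}


def relative_distance(m):
    """Manhattan distance between the utility vectors of the two orders,
    divided by the total utility of the reference order."""
    t = m
    in_order = list(range(m))
    reverse = in_order[::-1]
    u_ord = utilities(in_order, t)
    u_rev = utilities(reverse, t)
    dist = sum(abs(u_ord[o] - u_rev[o]) for o in in_order)
    return dist / sum(u_ord.values())


if __name__ == "__main__":
    for m in (2, 10, 100, 1000):
        print(m, relative_distance(m))
```

With $m = 100$ the distance is $5000$ against a total utility of $5050$, so the two opposite priority orders are already almost as far apart as the definition allows.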

Theorem 5.3 For every $\epsilon > 0$, there is no polynomial algorithm for finding the $(\frac{1}{2} - \epsilon)$-approximation fair
schedule, unless P = NP.

Proof. Intuitively, we divide time into $(\|B\|^2 + 3)$ independent batches. The jobs in the last batch are
significantly larger than all the previous ones. We construct the jobs in the first $(\|B\|^2 + 2)$ batches so that
the order of execution of the jobs in the last batch depends on whether there exists a subset $S' \subseteq S$ such that
$\sum_{x_i \in S'} x_i = x$. If the subset does not exist, the organizations are prioritized in some predefined order $\sigma_{ord}$;
otherwise, the order is reversed ($\sigma_{rev}$). The sizes of the jobs in the last batch are so large that they dominate
the values of the utilities of the organizations. The relative distance between the utilities in $\sigma_{ord}$ and in $\sigma_{rev}$
is $(1 - \epsilon)$, so any $(\frac{1}{2} - \epsilon)$-approximation algorithm $\mathcal{A}$ would allow us to infer the true fair schedule for such a
constructed instance, and so the answer to the initial SubsetSum problem. The precise construction is
described below.

We show that if there is a $(\frac{1}{2} - \epsilon)$-approximation algorithm $\mathcal{A}$ for calculating the vector of the
contributions, then we would be able to use $\mathcal{A}$ for solving the SubsetSum problem (which is NP-hard). This
proof is similar in spirit to the proof of Theorem 5.1. Let $I$ be an instance of the SubsetSum problem,
in which we are given a set $S = \{x_1, x_2, \ldots, x_k\}$ of $k$ integers and a value $x$. In the SubsetSum problem
we ask for the existence of a subset $S' \subseteq S$ such that $\sum_{x_i \in S'} x_i = x$; we will call the subsets $S' \subseteq S$ such
that $\sum_{x_i \in S'} x_i = x$ the x-sum subsets.

From $I$ we construct the instance of the problem of calculating the vector of contributions in the
following way. We set $O = O_S \cup \{a\} \cup B$ to be the set of all organizations, where $O_S = \{O_1, \ldots, O_k\}$
($\|O_S\| = k$) is the set of the organizations corresponding to the appropriate elements of $S$ and $\{a\} \cup B$,
where $B = \{B_1, \ldots, B_\ell\}$ ($\ell = \|B\|$ will be defined afterwards; intuitively $\ell \gg k$), is the set of dummy
organizations needed for our construction.

We divide the time into $(\|B\|^2 + 3)$ independent batches. The batches are constructed in such a way that
the $(j + 1)$-th batch starts after the time in which all the jobs released in the $j$-th batch are completed in every
coalition (thus, the duration of the batch can be just the maximum release time plus the sum of the processing
times of the jobs released in this batch). As a result, the contribution $\phi^{(u)}$ of each organization $O^{(u)}$ is
the sum of its contributions in all the $(\|B\|^2 + 3)$ batches. For the sake of clarity of the presentation we
assume that time moments in each batch are counted from 0.

We start from the following observation: if the sum of the processing times of the jobs in a batch is equal
to $p_{sum}$, then the contribution of each organization can be upper bounded by $p_{sum}^2$. This observation follows
from the fact that any organization, when joining a coalition, cannot decrease the completion time of any job
by more than $p_{sum}$. As the total number of unit-size parts of the jobs is also $p_{sum}$, we infer that the joining
organization cannot increase the value of the coalition by more than $p_{sum}^2$. The second observation is the
following: if the joining organization causes a decrease of the completion time of a task with processing
time $p$, then its contribution is at least equal to $\frac{p}{\|O\|!}$ (as it must decrease the start time of the job by at least
one time unit in at least one coalition).

Let $x_{tot} = \sum_{j=1}^{k} x_j$. In our construction we use 4 large numbers $L$, $XL$, $H$ and $XH$, where
$L = (\|O\| + 1 + 4\|B\|^2 x_{tot}^2) \cdot \|O\|!$; $XL = \|O\|!\,(L \cdot \|O\|(\|O\| + 1))^2 + \|O\|!\,4\|B\|^2 x_{tot}^2 + 1$;
$H = \|B\|^2 (2\|O\|(1 + x_{tot}) + 2x + XL)^2 + 1$; and $XH$ is a very large number that will be defined afterwards.
Intuitively: $XH \gg H \gg XL \gg L \gg x_{tot}$.

In the first batch only the organizations from $B$ release their jobs. The $i$-th organization from $B$ releases
$2i$ jobs in time 0, each of size $L$. This construction is used to ensure that after the first batch the $i$-th
organization from $B$ has the difference $(\phi^{(i)} - \psi^{(i)})$ greater than the difference $(\phi^{(i+1)} - \psi^{(i+1)})$ of the
$(i+1)$-th organization from $B$ by at least $\frac{L}{\|O\|!} = (\|O\| + 1 + 4\|B\|^2 x_{tot}^2)$ and by at most
$p_{sum}^2 = (L \cdot \|B\|(\|B\| + 1))^2 < \frac{XL}{\|O\|!} - 4\|B\|^2 x_{tot}^2$.

In the second batch, at time 0, all the organizations except for $a$ release 2 jobs, each of size $H$. This
construction is used to ensure that after the second batch the contribution (and so the difference $(\phi - \psi)$)
of the organization $a$ is large (at least equal to $H$, as $a$ joining any coalition causes a job of size $H$ to be
scheduled at least one time unit earlier). Since in each of the next $\|B\|^2$ batches the total size of the released
jobs will be lower than $(2\|O\|(1 + x_{tot}) + 2x + XL)$, we know that in each of the next $\|B\|^2$ batches the
jobs of $a$ will be prioritized over the jobs of the other organizations.

Each of the next $\|B\|^2$ batches is one of $2\|B\|$ different types. For the organization $B_i$ ($1 \le i \le \|B\|$)
there are exactly $i$ batches of type Bch($B_i$, $2x + 1$) and $(\|B\| - i)$ batches of type Bch($B_i$, $2x$). The order of
these $\|B\|^2$ batches can be arbitrary.

The batches Bch($B_i$, $2x$) and Bch($B_i$, $2x + 1$) are similar. The only difference is in the jobs of the
organization $a$. In the batch Bch($B_i$, $2x$) the organization $a$ has two jobs $J_1^{(a)}$ and $J_2^{(a)}$ with release times
$r_1^{(a)} = 0$ and $r_2^{(a)} = 2x$ and processing times $p_1^{(a)} = 2x + 1$ and $p_2^{(a)} = XL$. In the batch Bch($B_i$, $2x + 1$)
the organization $a$ has two jobs $J_1^{(a)}$ and $J_2^{(a)}$ with release times $r_1^{(a)} = 0$ and $r_2^{(a)} = 2x + 1$ and processing
times $p_1^{(a)} = 2x + 2$ and $p_2^{(a)} = XL$. All other organizations have the same jobs in batches Bch($B_i$, $2x$) and
Bch($B_i$, $2x + 1$). The organization $B_i$ has no jobs and all the other organizations from $B$ release a single
job of size $(2x_{tot} + 2)$ in time 0. The $j$-th organization from $O_S$ has two jobs $J_1^{(j)}$ and $J_2^{(j)}$ with release
times $r_1^{(j)} = 0$ and $r_2^{(j)} = 1$ and processing times $p_1^{(j)} = 2x_{tot} + 1$ and $p_2^{(j)} = 2x_j$.

Finally, in the last, $(\|B\|^2 + 3)$-th batch only the organizations from $B$ release their jobs. Each such
organization releases $\|O\|$ jobs in time 0, each of size $XH$.

Now let us compare the schedules for the batches Bch($B_i$, $2x$) and Bch($B_i$, $2x + 1$) (see Figure 3). Let
us consider a schedule for a coalition $C'$. Let $O_{S,C'} = O_S \cap C'$; let $B_{C'} = B \cap C' \setminus \{B_i\}$. Let $J_1$ denote the
set of $\|O_{S,C'}\|$ jobs of sizes $2x_{tot} + 1$ (these are the first jobs of the organizations from $O_{S,C'}$). Let $J_2$ denote
the set of $\|O_{S,C'}\|$ jobs of sizes $2x_j$ (the second jobs of the organizations from $O_{S,C'}$). Let $J_3$ denote the
$\|B_{C'}\|$ jobs of the organizations from $B_{C'}$ of sizes $2x_{tot} + 2$ (the single jobs of these organizations).

If $\sum_{x_i : O_i \in O_{S,C'}} x_i > x$ or $\sum_{x_i : O_i \in O_{S,C'}} x_i < x$, the schedules for any $C'$ in batches Bch($B_i$, $2x$) and
Bch($B_i$, $2x + 1$) look similar. In time 0, $\|O_{S,C'}\|$ machines will schedule the $\|O_{S,C'}\|$ jobs from $J_1$ (let
us denote these machines as $\mathcal{M}$) and $\|B_{C'}\|$ machines will schedule the $\|B_{C'}\|$ jobs from $J_3$. If $B_i \notin C'$
then the jobs from $J_2$ will be scheduled on the machines from $\mathcal{M}$ just after the jobs from $J_1$. If $B_i \in C'$
and $a \notin C'$, then the coalition $C'$ has $(\|O_{S,C'}\| + \|B_{C'}\| + 1)$ machines; one machine will execute the jobs
from $J_2$. If $B_i \in C'$ and $a \in C'$ then the coalition $C'$ has $(\|O_{S,C'}\| + \|B_{C'}\| + 2)$ machines. One machine
(denoted as $M'$) will execute the jobs from $J_2$ and one other machine (denoted as $M''$) will execute the
job $J_1^{(a)}$. Now, if $\sum_{x_i : O_i \in O_{S,C'}} x_i < x$ then $J_2^{(a)}$ will be scheduled on $M'$; otherwise on $M''$ (this follows
from the construction in the second batch; we recall that the jobs of $a$ should be prioritized). Thus, as
explained in Figure 3, if $\sum_{x_i : O_i \in O_{S,C'}} x_i > x$ or $\sum_{x_i : O_i \in O_{S,C'}} x_i < x$ the contribution and the utility of
each organization from $B$ in the two batches Bch($B_i$, $2x$) and Bch($B_i$, $2x + 1$) differ by at most $4x_{tot}^2$.

If $\sum_{x_i : O_i \in O_{S,C'}} x_i = x$, then the schedules for the cases (i) $B_i \notin C'$ and (ii) ($B_i \in C'$ and $a \notin C'$) remain
the same as in the case $\sum_{x_i : O_i \in O_{S,C'}} x_i \ne x$. For the last case ($B_i \in C'$ and $a \in C'$) the jobs from $J_1$, from $J_2$
and $J_1^{(a)}$ are scheduled in the same way as previously. However, the job $J_2^{(a)}$ will be scheduled in Bch($B_i$,
$2x + 1$) on machine $M'$ (in the moment it is released) and in Bch($B_i$, $2x$) on machine $M'$ or $M''$ (one time
unit later than it was released). As explained in Figure 3, if there exists an x-sum subset $S' \subseteq S$, then the
contribution of $B_i$ in Bch($B_i$, $2x + 1$) will be greater by at least $\frac{XL}{\|O\|!} - 4x_{tot}^2$ than in Bch($B_i$, $2x$).

As a result, if there does not exist an x-sum subset $S' \subseteq S$, then the difference $(\phi^{(i)} - \psi^{(i)})$ for the
$i$-th organization from $B$ will be greater than the difference $(\phi^{(i+1)} - \psi^{(i+1)})$ for the $(i + 1)$-th organization
from $B$ by at least $(\|O\| + 1)$ (from the construction in the first batch the difference $(\phi^{(i)} - \psi^{(i)})$ was greater
than $(\phi^{(i+1)} - \psi^{(i+1)})$ by at least $(\|O\| + 1 + 4\|B\|^2 x_{tot}^2)$, and as explained in Figure 3 the difference
$(\phi^{(i)} - \psi^{(i)} - \phi^{(i+1)} + \psi^{(i+1)})$ could change by at most $4\|B\|^2 x_{tot}^2$). Otherwise, the difference $(\phi^{(i)} - \psi^{(i)})$
for the $i$-th organization will be lower than for the $(i + 1)$-th organization (as there are more batches of type
Bch($B_{i+1}$, $2x + 1$) than of type Bch($B_i$, $2x + 1$)).

Thus, if there does not exist an x-sum subset $S' \subseteq S$, then in the last batch the jobs of $B_1$ will be
scheduled first, then the jobs of $B_2$, and so on; let us denote such a schedule as $\sigma_{ord}$. On the other hand, if
there exists an x-sum subset $S' \subseteq S$, the jobs in the last batch will be scheduled in exactly the reverse order;
such a schedule will be denoted as $\sigma_{rev}$.

Now, let us assess the distance between the vectors of utilities for the two schedules $\sigma_{ord}$ and $\sigma_{rev}$.
Let us assume that $\|B\|$ is even. Every job of the organization $B_i$ ($1 \le i \le \frac{\|B\|}{2}$) in the last batch is
started $(\|B\| - 2i + 1)\,XH$ time units earlier in $\sigma_{ord}$ than in $\sigma_{rev}$. The jobs of the organization $B_{\|B\| - i + 1}$
($1 \le i \le \frac{\|B\|}{2}$) are scheduled $(\|B\| - 2i + 1)\,XH$ time units later in $\sigma_{ord}$ than in $\sigma_{rev}$. Since each such job
consists of $XH$ unit-size elements, the distance between the vectors of utilities for $\sigma_{ord}$ and $\sigma_{rev}$, denoted as
$\Delta\psi$, can be lower bounded by:

\[ \Delta\psi \ge 2\|O\| \sum_{i=1}^{\|B\|/2} (2i - 1)\,XH^2 = \|O\|\,\|B\|\,\frac{1 + 2\frac{\|B\|}{2} - 1}{2}\,XH^2 = \frac{1}{2}\|O\|\,\|B\|^2\,XH^2. \]
Figure 3: The schedule for the coalition $C'$ such that $B_i \in C'$ and $a \in C'$ in batch Bch($B_i$, $2x + 1$), for 3 cases:
a) $\sum_{x_i : O_i \in O_{S,C'}} x_i > x$, b) $\sum_{x_i : O_i \in O_{S,C'}} x_i < x$, c) $\sum_{x_i : O_i \in O_{S,C'}} x_i = x$. We compare $B_i$'s contribution
$\phi$ on this schedule to the schedule for Bch($B_i$, $2x$) (not shown; the only differences are that $p_1^{(a)} = 2x + 1$ and
$r_2^{(a)} = 2x$). Organizations other than $B_i$ have utility equal to contribution in all cases considered here. As
$B_i$ has no jobs, it contributes only a single machine (corresponding to $M'$). Thanks to $M'$, the small jobs
$J_2^{(i)}$ execute at most $2x_{tot}$ earlier (if there is no machine $M'$, these jobs are executed on $\mathcal{M}$). The total size
of these small jobs is $2x_{tot}$. Regarding small jobs, the resulting contribution of $B_i$ to $C'$ is bounded by $4x_{tot}^2$.

In case a) $M'$ does not decrease the start time of the large job $J_2^{(a)}$; the same happens in batch Bch($B_i$,
$2x$). In case b) $M'$ speeds up $J_2^{(a)}$ by 1; the same happens in batch Bch($B_i$, $2x$). In case c) $M'$ also speeds
up $J_2^{(a)}$ by 1; however, in batch Bch($B_i$, $2x$) $M'$ does not decrease the start time of $J_2^{(a)}$ ($J_2^{(a)}$ is always started
at $(2x + 1)$). To summarize, the contribution of $B_i$ to $C'$ in both a) and b) differs by at most $4x_{tot}^2$ between Bch($B_i$,
$2x$) and Bch($B_i$, $2x + 1$). In contrast, in c) the contribution in Bch($B_i$, $2x + 1$) is greater by at least
$XL - 4x_{tot}^2$ compared to the contribution in Bch($B_i$, $2x$).

As a consequence, considering the contribution of $B_i$ to all coalitions, if there exists an x-sum subset
$S' \subseteq S$ (case c), then the contribution of $B_i$ in Bch($B_i$, $2x + 1$) is by at least $\frac{XL}{\|O\|!} - 4x_{tot}^2$ greater than in
Bch($B_i$, $2x$); if there is no such x-sum subset, then the contributions of $B_i$ in Bch($B_i$, $2x + 1$) and in
Bch($B_i$, $2x$) differ by no more than $4x_{tot}^2$.






Now, we can define XH to be the total size of the all except the last batch times |. Below we show how 
to bound the total utility ip tot of the true fair schedule (p m d or <r rev ) in the time t when all the jobs are 
completed. Each unit size part of the job completed in time t contributes to the utility the value 1. Each unit 
size part of the job executed in time t — 1 is worth 2, and so on. Since the jobs in the last batch are executed 
on 1 1 (9 1 1 machines and the duration of the batch is equal to the utility of the jobs from the last batch 

is equal to Yl}=i The jobs in all previous batches are started no earlier than in t — \\B\\XH — jXH. 
The duration of the all but the last batch can be upper bounded by \XH. There are ||0|| machines, so the 
utility of the jobs from the all but the last batch can be upper bounded by (||£>||Xi7 + ^XH)^XH. Thus 
we get the following bound on iptot: 

(\\B\\XH \ 
£ i+(\\B\\XH+ € -XH)C-XH)\ 

0\\ ( l + ^ XH \\B\\XH + \\\B\\XH 2 + ^XH 2 ) 
0\\ Q (1 + \\B\\XH) 2 + ^\\B\\ 2 XH 2 \ 

We can chose the size ||£>j| so that ( """Iff ) < 1 + |. As the result we have: 

^.>^(( i wr) 2+ l)>^> 1 - 

Finally, let us assume that there exists a $(\frac{1}{2} - \epsilon)$-approximation algorithm $\mathcal{A}$ that returns the schedule $\sigma$
for our instance. Now, if $\sigma$ is closer to $\sigma_{ord}$ than to $\sigma_{rev}$, we can infer that $\sigma_{ord}$ is a true fair solution to our
instance (and so the answer to the initial SubsetSum question is "no"). Otherwise, $\sigma_{rev}$ is a true solution
(and the answer to the SubsetSum problem is "yes"). This completes the proof.

□ 



Iptot < 




5.1 Special case: unit-size jobs 

When the jobs are unit-size, the problem has additional properties that allow us to construct an efficient
approximation algorithm (the complexity of this special case remains open). The results in this section do
not, however, generalize to related or unrelated processors. For unit-size jobs, the value of each coalition $v(C)$ does
not depend on the schedule:

Proposition 5.4 For any two greedy algorithms $A_1$ and $A_2$, for each coalition $C$ and each time moment $t$,
the values of the coalitions $v(A_1, C, t)$ and $v(A_2, C, t)$ are equal, provided all jobs are unit-size.

Proof. We prove the following stronger thesis: for every time moment $t$ any two greedy algorithms $A_1$ and
$A_2$ schedule the same number of jobs till $t$. We prove this thesis by induction. The base step for $t = 0$ is
trivial. Having the thesis proven for $(t - 1)$ and thus knowing that in $t$ in both schedules there is the same
number of jobs waiting for execution (here we use the fact that the jobs are unit-size), we infer that in $t$
the two algorithms schedule the same number of jobs. Since the value of the coalition does not take into
account the owner of the job, we get the thesis for $t$. This completes the proof. □
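Proposition 5.4 can be illustrated with two greedy policies that differ only in which waiting job they pick. The sketch below is a toy simulation under stated assumptions: the function name `greedy_schedule`, the job representation and the two selection rules (oldest-first and newest-first) are all hypothetical, and a unit-size job started at time $s$ is worth $t - s$ at time $t$.

```python
def greedy_schedule(machines, jobs, t, newest_first):
    """Simulate a greedy (never idle while a job waits) schedule of unit-size jobs.

    jobs: list of (release_time, owner). Returns (coalition value, utility per owner)
    at time t."""
    pending = sorted(jobs)
    waiting, per_owner, idx = [], {}, 0
    for now in range(t):
        # move jobs released up to `now` into the waiting queue
        while idx < len(pending) and pending[idx][0] <= now:
            waiting.append(pending[idx][1])
            idx += 1
        # start as many waiting jobs as there are processors
        for _ in range(min(machines, len(waiting))):
            owner = waiting.pop() if newest_first else waiting.pop(0)
            per_owner[owner] = per_owner.get(owner, 0) + (t - now)
    return sum(per_owner.values()), per_owner


if __name__ == "__main__":
    jobs = [(0, "a"), (0, "b"), (0, "b")]
    print(greedy_schedule(1, jobs, 3, newest_first=False))
    print(greedy_schedule(1, jobs, 3, newest_first=True))
```

The two policies split the utility between the owners differently, but the coalition value is the same, which is exactly what makes the simplified schedules of this section sufficient.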
As a result, we can use the randomized approximation algorithm for the scheduling problem restricted
to unit-size jobs (Algorithm 3). The algorithm is inspired by the randomized approximation algorithm for
computing the Shapley value presented by Liben-Nowell et al. [26]. However, in our case, the game is not
supermodular (which is shown in Proposition 5.5 below), and so we have to adapt the algorithm and thus
obtain different approximation bounds.

Proposition 5.5 In case of unit-size jobs the cooperative game in which the value of the coalition $C$ is
defined by $v(C) = \sum_{O^{(u)} \in C} \psi(O^{(u)})$ is not supermodular.

Proof. Consider the following instance with 3 organizations: $a$, $b$ and $c$, each owning a single machine.
Organizations $a$ and $b$ in time $t = 0$ release two unit-size jobs each; the organization $c$ has no jobs. We are
considering the values of the coalitions in time $t = 2$: $v(\{a, c\}) = 4$ (the two jobs are scheduled in time 0),
$v(\{b, c\}) = 4$, $v(\{a, b, c\}) = 7$ (three jobs are scheduled in time 0 and one in time 1) and $v(\{c\}) = 0$ (there
is no job to be scheduled). We see that $v(\{a, b, c\}) + v(\{c\}) < v(\{a, c\}) + v(\{b, c\})$, which can be written
as:

\[ v(\{a, c\} \cup \{b, c\}) + v(\{a, c\} \cap \{b, c\}) < v(\{a, c\}) + v(\{b, c\}). \]

This shows that the game is not supermodular. □
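The coalition values used in this counterexample can be recomputed with a few lines of code. This is a minimal sketch for unit-size jobs only; the dictionary `ORGS` and the function `coalition_value` are hypothetical names, and every organization is assumed to own exactly one machine as in the proof.

```python
# Release times of the unit-size jobs of each organization (one machine each).
ORGS = {"a": [0, 0], "b": [0, 0], "c": []}


def coalition_value(coalition, t):
    """Value of a coalition at time t under any greedy schedule of unit-size jobs:
    a job started at time s contributes t - s."""
    machines = len(coalition)
    releases = sorted(r for org in coalition for r in ORGS[org])
    total, waiting, idx = 0, 0, 0
    for now in range(t):
        while idx < len(releases) and releases[idx] <= now:
            waiting += 1
            idx += 1
        started = min(machines, waiting)
        waiting -= started
        total += started * (t - now)
    return total


if __name__ == "__main__":
    t = 2
    union = coalition_value({"a", "b", "c"}, t)
    inter = coalition_value({"c"}, t)
    left = coalition_value({"a", "c"}, t)
    right = coalition_value({"b", "c"}, t)
    print(union, inter, left, right)
    print(union + inter < left + right)
```

The last line prints whether the supermodularity inequality is violated for this instance.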

In this algorithm we keep simplified schedules for a random subset of all possible coalitions. For each 
organization the set Subs[0^] keeps N = In f |z-H random coalitions not containing OW; for 

each such random coalition $C'$ which is kept in Subs[$O^{(u)}$], Subs'[$O^{(u)}$] contains the coalition $C' \cup \{O^{(u)}\}$.
For the coalitions kept in Subs[$O^{(u)}$] we store a simplified schedule (the schedule that is determined by
an arbitrary greedy algorithm). The simplified schedule allows us to find the value $v(C)$ of the coalition
$C$. Maintaining the whole schedule would require the recursive information about the schedules in the
subcoalitions of $C$. However, as a consequence of Proposition 5.4, we know that the value of the coalition
$v(C)$ can be determined by an arbitrary greedy algorithm.⁷

The third foreach loop in procedure FairAlgorithm (line 26 in Algorithm 3) updates the values of 
all coalitions kept in Subs and Subs'. From Equation 3 it follows that after one time unit, if no additional job 
is scheduled, the value of the coalition increases by the number of completed unit-size parts of the jobs (here, 
as the jobs are unit-size, the number of the completed jobs is finPerCoal[$C'$]). At time moment $t$, all waiting 
jobs (the number of such jobs is $\|\mathrm{jobs}[C][O^{(u)}]\|$) are scheduled provided there are enough processors (the 
number of the processors is $\sum_{O^{(u)} \in C} m^{(u)}$). If $n$ additional jobs are scheduled at time $t$ then the value of 
the coalition at time $t$ increases by $n$. 
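
The update rule described in this paragraph can be sketched as follows. This is a minimal illustration, not the paper's Algorithm 3: the function name, the `releases` input format and the variable `started_total` (which plays the role of finPerCoal) are assumptions made for the example, and the value is reported after processing time moments 0, ..., horizon - 1.

```python
def coalition_value_incremental(num_processors, releases, horizon):
    """Track the value of one coalition with unit-size jobs.

    releases: dict mapping a time moment to the number of jobs released then
              (hypothetical input format).
    """
    value = 0
    started_total = 0  # jobs started so far; stands in for finPerCoal
    waiting = 0
    for t in range(horizon):
        value += started_total            # each earlier job gains one unit
        waiting += releases.get(t, 0)
        n = min(waiting, num_processors)  # unit-size jobs free every processor each step
        waiting -= n
        started_total += n
        value += n                        # n newly scheduled jobs add n to the value
    return value

print(coalition_value_incremental(3, {0: 4}, 2))
print(coalition_value_incremental(2, {0: 2}, 2))
```

With three processors and four jobs released at time 0, two time moments give the value 7 used in the proof of Proposition 5.5; with two processors and two jobs the value is 4.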

In the fourth foreach loop (line 32 in Algorithm 3), once again we use the fact that the utility of 
the organization after one time unit increases by the number of finished jobs (finPerOrg[$O^{(u)}$]). In the 
last foreach loop (line 35) the contribution of the organization is approximated by summing the marginal 
contributions marg_$\phi$ only for the kept coalitions. Theorem 5.6 below gives the bounds for the quality of 
approximation. 

Theorem 5.6 Let $\psi$ denote the vector of utilities in the schedule determined by Algorithm 3. If the jobs are 
unit-size, then Algorithm 3, with probability $\lambda$, determines an $\epsilon$-approximation schedule, i.e., it guarantees 
the following bound on the distance to the truly fair solution: 

$$\|\psi - \psi^*\|_M \le \epsilon |\psi^*|.$$

7 At this point we use the assumption that the jobs are unit-size. The algorithm cannot be extended to the general case: 
there, calculating the value of each subcoalition would require the exact schedule, which cannot be determined 
in polynomial time (Theorem 5.1). 
Proof. Let us consider an organization $O^{(u)}$ participating in a coalition $C$ and a time moment $t$. Let 
$\phi^{(u),*}$ and $\psi^{(u),*}$ denote the contribution and the utility of the organization $O^{(u)}$ in the coalition $C$ at time moment $t$ 
in a truly fair schedule. Let $v^*(C)$ denote the value of the coalition $C$ in a truly fair schedule. According to the 
notation in Algorithm 3, let $\phi[O^{(u)}]$ and $\psi[O^{(u)}]$ denote the contribution and the utility of the organization $O^{(u)}$ 
in the coalition $C$ at time $t$ in the schedule determined by Algorithm 3. Let $N = \frac{\|C\|^2}{\epsilon^2} \ln\frac{\|C\|}{1-\lambda}$. First, note 
that 

$$|\psi^{(u),*} - \psi[O^{(u)}]| \le |\phi^{(u),*} - \phi[O^{(u)}]|.$$

Indeed, if the contribution of the organization $O^{(u)}$ increases by 
a given value $\Delta\phi$, then Algorithm 3 will schedule $\Delta\phi$ more unit-size jobs of the organization, provided 
there are enough such jobs waiting for execution. 

Let $X$ denote the random variable that with probability $\frac{1}{\|C\|!}$ returns the marginal contribution of 
the organization $O^{(u)}$ to the coalition composed of the organizations preceding $O^{(u)}$ in a random order 
(of course, there are $\|C\|!$ such orderings). We know that $X \in [0, v^*(C)]$ and that $E(X) = \phi^{(u),*}$. 
Algorithm 3 is constructed in such a way that $\phi[O^{(u)}] = \frac{1}{N}\sum_{i=1}^{N} X_i$, where the $X_i$ are independent copies of 
$X$. Thus, $E(\phi[O^{(u)}]) = \phi^{(u),*}$. From Hoeffding's inequality we get the following bound on the probability $p_\epsilon$: 

$$p_\epsilon = P\left(\phi[O^{(u)}] - \phi^{(u),*} > \frac{\epsilon}{\|C\|} v^*(C)\right) \le \exp\left(-\frac{\epsilon^2 N}{\|C\|^2}\right) = \exp\left(-\ln\frac{\|C\|}{1-\lambda}\right) = \frac{1-\lambda}{\|C\|}.$$

The probability that $\phi - \phi^* > \epsilon v^*(C)$ can be bounded by $p_\epsilon \|C\| = 1 - \lambda$. As a result, the 
probability that $\psi - \psi^* > \epsilon v^*(C)$ can also be bounded by $1 - \lambda$, which completes the proof. □ 
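
The sampling idea used in the proof can be illustrated with a short Python sketch: the contribution of an organization is estimated as the average marginal contribution over random orderings. The formula for the sample size, N = (||C||^2 / ε^2) ln(||C|| / (1 - λ)), is an assumption of this sketch, and the names `sample_size`, `sampled_contribution` and the additive toy game are hypothetical, chosen only to make the example self-contained.

```python
import math
import random

def sample_size(num_orgs, eps, lam):
    """Assumed sample size N = (k^2 / eps^2) * ln(k / (1 - lam)), rounded up."""
    return math.ceil(num_orgs ** 2 / eps ** 2 * math.log(num_orgs / (1 - lam)))

def sampled_contribution(org, orgs, value, n_samples, rng):
    """Average marginal contribution of `org` over random orderings of `orgs`."""
    total = 0.0
    for _ in range(n_samples):
        order = list(orgs)
        rng.shuffle(order)
        preceding = frozenset(order[:order.index(org)])
        total += value(preceding | {org}) - value(preceding)
    return total / n_samples

# Toy additive game (hypothetical): every marginal contribution equals the weight.
weights = {"a": 1.0, "b": 2.0, "c": 4.0}
toy_value = lambda coalition: sum(weights[o] for o in coalition)

n = sample_size(len(weights), eps=0.5, lam=0.9)
estimate = sampled_contribution("b", list(weights), toy_value, n, random.Random(0))
print(n, estimate)
```

In a scheduling setting, `value` would be replaced by a function that returns the value of a coalition, for example one computed from a greedy schedule of unit-size jobs.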

The complexity of Algorithm 3 is $\|C\| \cdot N = \|C\| \frac{\|C\|^2}{\epsilon^2} \ln\frac{\|C\|}{1-\lambda}$ times the complexity of the single-organization 
scheduling algorithm. As a consequence, we get the following result: 

Theorem 5.7 There exists an FPRAS for the problem of finding the fair schedule for the case when the jobs 
are unit-size. 

6 Experimental evaluation of the algorithms 

In the previous section we showed that the problem of finding a fair schedule is computationally intractable. 
However, the ideas used in the exponential and the FPRAS algorithms can be used as insights to create 
reasonable heuristics. In this section we present the experimental evaluation of the fairness of two simple 
algorithms described below. 

6.1 Algorithms 

We verified the following algorithms: 

Rand is Algorithm 3 used as a heuristic for workloads with jobs having different sizes. We verify two 
versions of the algorithm, with N = 15 and N = 75 random subcoalitions. 

DirectContr keeps for each organization its utility $\psi_{sp}$ (calculated in the same way as in the algorithms 
presented earlier) and its approximate contribution $\phi$. The approximate contribution of each organization is estimated 
directly (without considering any subcoalitions) by the following heuristic. At each time moment $t$ we consider 
the machines in a random order and assign waiting jobs to free machines. A job that is started on 
machine $m$ increases the contribution $\phi$ of the owner of $m$ by the utility of this job. The waiting jobs are 
assigned to the machines in the order of decreasing differences $(\phi - \psi)$ of the issuing organizations (similarly 
to Algorithm 3). The pseudocode of this algorithm is presented in Algorithm 4. 
Algorithm 4: The simple heuristic algorithm for the problem of fair scheduling. 
Notation: 

own(M), own(J): the organization owning the machine M, the job J 
wait(O): the set of released, but not-yet scheduled jobs of the organization O 

FairAlgorithm(C): 
    foreach O^(u) ∈ C do 
        finUt[O^(u)] ← 0; 
        finCon[O^(u)] ← 0; 
        φ[O^(u)] ← 0; 
        ψ[O^(u)] ← 0; 
    foreach time moment t do 
        foreach O^(u) ∈ C do 
            φ[O^(u)] ← φ[O^(u)] + finCon[O^(u)]; 
            ψ[O^(u)] ← ψ[O^(u)] + finUt[O^(u)]; 
        γ ← generate a random permutation of the set of all machines; 
        foreach m ∈ γ do 
            if not FreeMachine(m, t) then 
                J ← RunningJob(m); 
                finUt[own(J)] ← finUt[own(J)] + 1; 
                finCon[own(m)] ← finCon[own(m)] + 1; 
        foreach m ∈ γ do 
            if FreeMachine(m, t) and ∪_{O^(u)} wait(O^(u)) ≠ ∅ then 
                org ← argmax_{O^(u): wait(O^(u)) ≠ ∅} (φ[O^(u)] − ψ[O^(u)]); 
                J ← first waiting job of org; 
                startJob(J, m); 
                finUt[org] ← finUt[org] + 1; 
                finCon[own(m)] ← finCon[own(m)] + 1; 
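
A runnable Python sketch of the DirectContr heuristic is given below. It follows the structure of Algorithm 4 under several assumptions that are not stated in the paper: time is discrete, jobs are tuples (release time, duration, owner) with integer times, ties in the argmax are broken by organization name, and the function and variable names are hypothetical.

```python
import random
from collections import deque

def direct_contr(machine_owner, jobs, horizon, seed=0):
    """machine_owner[m] is the owner of machine m; jobs are (release, duration, owner).

    Returns (phi, psi): approximate contributions and utilities per organization.
    """
    rng = random.Random(seed)
    orgs = sorted(set(machine_owner) | {owner for _, _, owner in jobs})
    phi = dict.fromkeys(orgs, 0)
    psi = dict.fromkeys(orgs, 0)
    fin_ut = dict.fromkeys(orgs, 0)
    fin_con = dict.fromkeys(orgs, 0)
    wait = {o: deque() for o in orgs}
    pending = sorted(jobs, key=lambda job: job[0])  # stable sort by release time
    next_job = 0
    busy_until = [0] * len(machine_owner)           # machine m is free at t iff busy_until[m] <= t
    running_owner = [None] * len(machine_owner)
    for t in range(horizon):
        for o in orgs:
            phi[o] += fin_con[o]
            psi[o] += fin_ut[o]
        while next_job < len(pending) and pending[next_job][0] <= t:
            _, duration, owner = pending[next_job]
            wait[owner].append(duration)
            next_job += 1
        order = list(range(len(machine_owner)))
        rng.shuffle(order)                          # random permutation of the machines
        for m in order:
            if busy_until[m] > t:                   # a running job completes one more unit part
                fin_ut[running_owner[m]] += 1
                fin_con[machine_owner[m]] += 1
        for m in order:
            candidates = [o for o in orgs if wait[o]]
            if busy_until[m] <= t and candidates:
                org = max(candidates, key=lambda o: phi[o] - psi[o])
                busy_until[m] = t + wait[org].popleft()
                running_owner[m] = org
                fin_ut[org] += 1
                fin_con[machine_owner[m]] += 1
    return phi, psi

phi, psi = direct_contr(["a", "b"], [(0, 2, "a"), (0, 1, "a"), (1, 1, "b")], horizon=4)
print(phi, psi)
```

Every increment of finUt is paired with an increment of finCon, so the total utility always equals the total contribution; only their split between the organizations differs.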



RoundRobin cycles through the list of organizations to determine the job to be started. 

6.2 Workloads 

To run the simulations, we chose the following workloads from the Parallel Workload Archive [8]: 

1. LPC-EGEE⁸ (cleaned version), 

2. PIK-IPLEX⁹, 

3. RICC¹⁰, 

4. SHARCNET-Whale¹¹. 

We selected traces that most closely resemble sequential workloads (in the selected traces most of the jobs 
require a single processor). We replaced each non-sequential job that required k > 1 processors with k copies of a 
sequential job having the same duration. 

In each workload, each job has a user identifier (the workloads contain 56, 225, 176 and 
154 distinct user identifiers, respectively). To distribute the jobs between the organizations, we uniformly distributed the 
user identifiers between the organizations; each job sent by a given user was assigned to the corresponding 
organization. 

8 www.cs.huji.ac.il/labs/parallel/workload/l_lpc/index.html 
9 www.cs.huji.ac.il/labs/parallel/workload/l_pik_iplex/index.html 
10 www.cs.huji.ac.il/labs/parallel/workload/l_ricc/index.html 
11 www.cs.huji.ac.il/labs/parallel/workload/l_sharcnet/index.html 

Figure 4: The average delay (or speed up) of a job due to the unfairness of the algorithm, $\Delta\psi/p_{tot}$, for 
different numbers of organizations. The 4 plots present experiments with different numbers of machines and 
different distributions of the machines between the organizations. 

The machines were distributed between the organizations according to Zipf's and uniform distributions. 
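
The preprocessing steps of this section can be sketched in Python as follows. The paper does not give the Zipf exponent, the rounding rule, or the exact way user identifiers were drawn, so the exponent of 1, the rounding scheme, the independent uniform draw per user, and all function names below are assumptions made purely for illustration.

```python
import random

def split_parallel_jobs(jobs):
    """Replace a job needing k processors with k sequential copies.

    jobs: list of (release, duration, processors, user).
    """
    return [(release, duration, user)
            for release, duration, processors, user in jobs
            for _ in range(processors)]

def assign_users(users, num_orgs, rng):
    """Assign each user identifier to an organization chosen uniformly at random."""
    return {user: rng.randrange(num_orgs) for user in users}

def zipf_machine_counts(num_machines, num_orgs, exponent=1.0):
    """Split machines proportionally to 1 / rank^exponent (exponent 0 gives a uniform split)."""
    weights = [1 / (rank ** exponent) for rank in range(1, num_orgs + 1)]
    total = sum(weights)
    counts = [int(num_machines * w / total) for w in weights]
    # Hand out machines lost to rounding, starting from the largest share.
    for i in range(num_machines - sum(counts)):
        counts[i % num_orgs] += 1
    return counts

trace = [(0, 10, 3, "u1"), (5, 2, 1, "u2")]
print(split_parallel_jobs(trace))
print(assign_users(["u1", "u2"], 5, random.Random(0)))
print(zipf_machine_counts(30, 5), zipf_machine_counts(30, 5, exponent=0.0))
```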



6.3 Results 

For each algorithm, we compared the vector of the utilities (the utilities per organization) at the end of the 
experiment (at time $t_{end}$), $\psi$, with the vector of the utilities in the ideally fair schedule, $\psi^*$ (the ideally fair 
schedule was determined by Algorithm 1). Let $p_{tot}$ denote the total number of the unit-size parts of the 
jobs completed in the fair schedule, $p_{tot} = \sum_{(s,p) \in \sigma^*: s \le t_{end}} \min(p, t_{end} - s)$. We calculated the difference 
$\Delta\psi = \|\psi - \psi^*\|_M = \sum_{O^{(u)}} |\psi^{(u)} - \psi^{(u),*}|$ and compared the values $\Delta\psi/p_{tot}$ for different algorithms. The 
value $\Delta\psi/p_{tot}$ is a measure of fairness that has an intuitive interpretation. Since delaying each unit-size 
part of a job by one time moment decreases the utility of the job owner by one, the value $\Delta\psi/p_{tot}$ gives the 
average unjustified delay (or unjustified speed up) of each job due to the unfairness of the algorithm. 
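
The fairness measure can be computed with a few lines of Python. The sketch below assumes that utilities are given as dictionaries keyed by organization and that a schedule is a list of (start time, processing time) pairs; these data formats and the function names are hypothetical.

```python
def completed_unit_parts(schedule, t_end):
    """p_tot: unit-size parts completed by t_end; schedule holds (start, processing_time)."""
    return sum(min(p, t_end - s) for s, p in schedule if s <= t_end)

def unfairness(psi, psi_fair, p_tot):
    """Manhattan distance between the utility vectors, normalized by p_tot."""
    delta = sum(abs(psi[o] - psi_fair[o]) for o in psi_fair)
    return delta / p_tot

fair_schedule = [(0, 2), (0, 1), (1, 1)]
p_tot = completed_unit_parts(fair_schedule, t_end=4)
print(p_tot, unfairness({"a": 8, "b": 2}, {"a": 6, "b": 4}, p_tot))
```

In this toy example organization a gains two units of utility at the expense of organization b, so the Manhattan distance is 4 and, with four completed unit parts, the average unjustified delay (or speed up) is one time moment per unit part.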



6.3.1 Different number of organizations 

We ran the experiments for the LPC-EGEE workload with the number of organizations varying between 2 
and 10. Since running the experiments for the whole trace was too computationally intensive (we compared 
our heuristics with the exact exponential algorithm), we truncated the data to the first 5 million time moments 
($t_{end} = 5 \cdot 10^6$). 

Figure 5: The average delay (or speed up) of a job due to the unfairness of the algorithm, $\Delta\psi/p_{tot}$, for 
different numbers of organizations. The 4 plots present experiments with different numbers of machines and 
different distributions of the machines between the organizations. 

First, we ran the experiments for 17 and 30 machines. The total processing time of all jobs is about 16 
times greater than 5 million (the duration of the experiment expressed in time moments); this means that the 
case of 17 machines corresponds to the scenario in which the machines are almost continuously busy, while in the case 
of 30 machines the machines have, on average, about 50% utilization. Second, to test the scalability of our 
algorithms, we ran the experiments for 170 and 300 machines. The release time of each job was multiplied 
by 0.1 to ensure that the machines are not idle for most of the time (the processing times were not modified). 
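
The utilization figures for 17 and 30 machines follow from a simple ratio, sketched below. The function name is hypothetical, and the estimate ignores release times and idle periods at the start of the trace, so it is only a rough upper bound.

```python
def expected_utilization(work_to_duration_ratio, num_machines):
    """Rough utilization: total work divided by the available machine time, capped at 1."""
    return min(1.0, work_to_duration_ratio / num_machines)

# Total processing time is about 16 times the experiment duration.
for machines in (17, 30):
    print(machines, round(expected_utilization(16, machines), 2))
```

For 17 machines the ratio is about 0.94 (almost continuously busy), and for 30 machines it is about 0.53.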

The values of $\Delta\psi/p_{tot}$ for different algorithms for the cases of 17 and 30 machines are presented in 
Figure 4; the same data for 170 and 300 machines are presented in Figure 5. These results show that: 

1. The round robin algorithm unfairly delays (or speeds up) an average job by 42120 time units (thus 
around 20 times the average job duration), and by up to $6 \cdot 10^5$ time units (more than 10% of the simulated time 
period and about 295 times the average job duration). 

2. Both algorithms, Rand and DirectContr, are reasonably fair, independently of the number of 
organizations. The average values of $\Delta\psi/p_{tot}$ for the two algorithms are 79 and 480, respectively; 
thus around 5% and 25% of the average job duration. The maximal values of $\Delta\psi/p_{tot}$ are equal to 
1767 and 25608, respectively. The algorithm Rand is fairer than DirectContr, but it is also 
more computationally intensive. 

3. The randomized algorithm gives good quality even for a small number (N = 15) of sampled 
subcoalitions; for N = 15 the average value of $\Delta\psi/p_{tot}$ is 91, while for N = 75 it is 79. 

6.3.2 Different workloads 

We ran the experiments for 4 different workloads. Because the experiments were computationally intensive, 
we ran them only for 5 organizations and only for the uniform distribution of the machines between the 
organizations. For each workload we used the number of machines that were originally used in the 
workload (that is 70, 2560, 8192 and 3072, respectively). We ran the experiments on shorter traces ($t_{end} = 
5 \cdot 10^5$). For the LPC-EGEE data we observed that the value $\Delta\psi/p_{tot}$ is increasing during the experiments 
(indicating that the system is not in the steady state), so the short experiments cannot be used for determining 
accurate bounds on $\Delta\psi/p_{tot}$. However, in this part we are interested in comparing the relative fairness of 
the algorithms. 

For each workload we ran 100 experiments (on different, non-overlapping periods of the workloads, each of 
length $5 \cdot 10^5$). The average values of $\Delta\psi/p_{tot}$ and the standard deviations are presented in Table 1. 





                    Rand (N = 15)       DirectContr         RoundRobin 
                    Avg     St. dev.    Avg     St. dev.    Avg     St. dev. 
LPC-EGEE            161     435         218     659         3774    14244 
PIK-IPLEX           10      69          7       44          28      170 
RICC                791     1348        1520    3765        7560    9226 
SHARCNET-Whale      72      197         177     632         538     1254 

Table 1: The average delay (or speed up) of a job due to the unfairness of the algorithm, $\Delta\psi/p_{tot}$, for 
different algorithms and different workloads. Each row is an average over 100 instances taken as parts of 
the original workload. 

These results are consistent with the detailed results on the LPC-EGEE trace from the previous section. The 
Rand algorithm is on average 11 times fairer than RoundRobin. The DirectContr algorithm 
is significantly fairer than RoundRobin, but 1.85 times worse than Rand. The relative quality of 
the algorithms does not depend on the workload. 
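
The relative-fairness ratios can be recomputed from the averages in Table 1. The text does not say how the ratios were aggregated over the workloads; the Python sketch below assumes the ratio of the column totals, which yields values close to the quoted ones. The dictionary layout and the function name are hypothetical.

```python
# Average values of the fairness measure from Table 1 (per workload and algorithm).
TABLE_1 = {
    "LPC-EGEE":       {"Rand": 161, "DirectContr": 218,  "RoundRobin": 3774},
    "PIK-IPLEX":      {"Rand": 10,  "DirectContr": 7,    "RoundRobin": 28},
    "RICC":           {"Rand": 791, "DirectContr": 1520, "RoundRobin": 7560},
    "SHARCNET-Whale": {"Rand": 72,  "DirectContr": 177,  "RoundRobin": 538},
}

def ratio_of_totals(table, algorithm, baseline="Rand"):
    """Sum of an algorithm's averages divided by the sum of the baseline's averages."""
    total = sum(row[algorithm] for row in table.values())
    return total / sum(row[baseline] for row in table.values())

print(round(ratio_of_totals(TABLE_1, "RoundRobin"), 2))
print(round(ratio_of_totals(TABLE_1, "DirectContr"), 2))
```

Under this assumption the ratios are about 11.5 for RoundRobin and about 1.86 for DirectContr, both relative to Rand.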

7 Conclusions 

In this paper we define the fairness of the scheduling algorithm in terms of cooperative game theory, which 
allows us to quantify the impact of an organization on the system. We present a non-monetary model in which it 
is not required that each organization has accurate valuations of its jobs and resources. We show that classic 
utility functions may create incentives for workload manipulations. We thus propose a strategy resilient 
utility function that can be thought of as per-organization throughput. 

We analyze the complexity of the fair scheduling problem. The general problem is NP-hard and difficult 
to approximate. Nevertheless, the problem parametrized with the number of organizations is FPT. Also, 
the FPT algorithm can be used as a reference for comparing the fairness of different algorithms on small 
instances. For a special case with unit-size jobs, we propose an FPRAS. In our experiments, we show that the 
FPRAS can be used as a heuristic algorithm; we also show another efficient heuristic. 

Since we do not require the valuation of the jobs, and we consider on-line, non-clairvoyant scheduling, 
we believe the presented results have practical consequences for real-life job schedulers. In our future 
work we plan to use our fairness metric to experimentally assess standard scheduling algorithms, such as 
FCFS or fair-share. Also, we want to extend our model to parallel jobs. 